Overview


Dataset statistics

Number of variables: 29
Number of observations: 121090
Missing cells: 36
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 26.8 MiB
Average record size in memory: 232.0 B
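
The percentage and per-record figures above can be cross-checked from the raw counts. The following is a minimal sketch; the variable names are illustrative, and it assumes that "missing cells (%)" is taken over all cells (variables × observations) and that 1 MiB is 1024 × 1024 bytes.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 29
n_observations = 121_090
missing_cells = 36
total_size_mib = 26.8  # already rounded in the report

# Assumption: the missing-cell percentage is relative to all cells.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.4f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Because the total size is itself rounded to one decimal place, the recomputed record size only agrees with the reported 232.0 B to within a fraction of a byte.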

Variable types

Numeric: 4
Text: 10
DateTime: 3
Categorical: 12

Alerts

# Sh has constant value "1" [Constant]
DepartmentCodeDescription is highly overall correlated with Group and 3 other fields [High correlation]
Gross Weight is highly overall correlated with Liability m₽ and 1 other fields [High correlation]
Group is highly overall correlated with DepartmentCodeDescription and 3 other fields [High correlation]
Group det is highly overall correlated with DepartmentCodeDescription and 5 other fields [High correlation]
Group2 is highly overall correlated with DepartmentCodeDescription and 5 other fields [High correlation]
Last Update User is highly overall correlated with Group2 and 2 other fields [High correlation]
Liability m₽ is highly overall correlated with Gross Weight and 1 other fields [High correlation]
Mode Of Transport Sh is highly overall correlated with Group det and 1 other fields [High correlation]
RU&VAT is highly overall correlated with Group det [High correlation]
Revenue (Local) is highly overall correlated with Gross Weight and 1 other fields [High correlation]
Shipment # is highly overall correlated with period [High correlation]
period is highly overall correlated with Shipment # [High correlation]
Автор is highly overall correlated with Group2 and 1 other fields [High correlation]
Статья ДДС is highly overall correlated with DepartmentCodeDescription and 5 other fields [High correlation]
RU&VAT is highly imbalanced (91.0%) [Imbalance]
Gross Weight is highly skewed (γ1 = 38.27961198) [Skewed]
Liability m₽ is highly skewed (γ1 = 162.353805) [Skewed]
Shipment # has unique values [Unique]
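
The γ1 values in the "Skewed" alerts are skewness coefficients. This document does not state which estimator the report uses, so the sketch below assumes the bias-adjusted Fisher-Pearson coefficient, the same family as pandas' `Series.skew`. The function name `sample_skewness` and the toy data are hypothetical and are not taken from the dataset.

```python
import math


def sample_skewness(values):
    """Bias-adjusted Fisher-Pearson skewness (assumed estimator, see lead-in)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    mean = sum(values) / n
    # Second and third central moments (population form).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    if m2 == 0:
        return 0.0  # constant data has no asymmetry
    g1 = m3 / m2 ** 1.5
    # Small-sample adjustment factor.
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


# Toy data only: a symmetric sample and one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 1, 50]
print(sample_skewness(symmetric))
print(sample_skewness(right_tailed))
```

A large positive coefficient such as those reported for Gross Weight and Liability m₽ indicates a long right tail: a few very large values dominate the distribution.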

Reproduction

Analysis started: 2025-06-01 10:51:49.579459
Analysis finished: 2025-06-01 10:52:02.171486
Duration: 12.59 seconds
Software version: ydata-profiling v4.16.1
Download configuration: config.json
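
As a quick consistency check, the reported duration follows from the two timestamps. This is a small sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2025-06-01 10:51:49.579459")
finished = datetime.fromisoformat("2025-06-01 10:52:02.171486")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```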

Variables

Shipment #
Real number (ℝ)

High correlation  Unique 

Distinct: 121090
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 769643.97
Minimum: 606011
Maximum: 831457
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 946.1 KiB

Quantile statistics

Minimum: 606011
5-th percentile: 714117.45
Q1: 738696.25
Median: 769575.5
Q3: 800676.75
95-th percentile: 825343.55
Maximum: 831457
Range: 225446
Interquartile range (IQR): 61980.5
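
The range and interquartile range are derived from the other quantiles in this block. A minimal sketch of that arithmetic, with illustrative variable names:

```python
# Quantiles copied from the "Quantile statistics" block for Shipment #.
minimum = 606011
q1 = 738696.25
q3 = 800676.75
maximum = 831457

value_range = maximum - minimum  # spread between the extremes
iqr = q3 - q1                    # spread of the middle 50% of values

print(value_range, iqr)
```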

Descriptive statistics

Standard deviation: 35743.106
Coefficient of variation (CV): 0.046441092
Kurtosis: -1.176826
Mean: 769643.97
Median Absolute Deviation (MAD): 30987
Skewness: -0.001974286
Sum: 9.3196188 × 10^10
Variance: 1.2775696 × 10^9
Monotonicity: Not monotonic
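
Several of these descriptive statistics are tied together by simple identities: CV is the standard deviation divided by the mean, variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below recomputes them from the rounded figures shown in the report, so the results agree only up to that rounding; the variable names are illustrative.

```python
# Rounded figures copied from the report for Shipment #.
n = 121_090
mean = 769643.97
std = 35743.106

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the square of the standard deviation
total = mean * n       # sum of all values

print(f"CV: {cv:.9f}")
print(f"Variance: {variance:.7e}")
print(f"Sum: {total:.7e}")
```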
Histogram with fixed size bins (bins=50)
Value                    Count     Frequency (%)
708064                   1         < 0.1%
789847                   1         < 0.1%
790419                   1         < 0.1%
790370                   1         < 0.1%
790583                   1         < 0.1%
790422                   1         < 0.1%
790421                   1         < 0.1%
790407                   1         < 0.1%
790311                   1         < 0.1%
790281                   1         < 0.1%
Other values (121080)    121080    > 99.9%

Value     Count    Frequency (%)
606011    1        < 0.1%
606012    1        < 0.1%
606013    1        < 0.1%
606017    1        < 0.1%
606021    1        < 0.1%
606022    1        < 0.1%
606023    1        < 0.1%
606028    1        < 0.1%
606037    1        < 0.1%
706390    1        < 0.1%

Value     Count    Frequency (%)
831457    1        < 0.1%
831456    1        < 0.1%
831455    1        < 0.1%
831454    1        < 0.1%
831453    1        < 0.1%
831452    1        < 0.1%
831451    1        < 0.1%
831450    1        < 0.1%
831449    1        < 0.1%
831448    1        < 0.1%

Distinct: 876
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 946.1 KiB

Length

Max length: 8
Median length: 8
Mean length: 7.9999917
Min length: 7

Characters and Unicode

Total characters: 968719
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 43
Unique (%): < 0.1%

Sample

1st row: 31/12/22
2nd row: 29/12/22
3rd row: 30/12/22
4th row: 28/12/22
5th row: 28/12/22
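
Because this variable is stored as text, converting it to real dates needs an explicit format. The sketch below assumes the strings are day/month/two-digit-year, which the sample row 31/12/22 suggests but the report does not state. The helper `parse_day_first` is hypothetical, and the non-date entry `#VALUE!` is only an illustrative guess at the kind of stray value hinted at by the handful of letters and punctuation in the character statistics.

```python
from datetime import datetime


def parse_day_first(text):
    """Parse 'dd/mm/yy' text; return None when the text is not a date.

    Assumption: day comes first, as the sample row 31/12/22 suggests.
    """
    try:
        return datetime.strptime(text, "%d/%m/%y").date()
    except ValueError:
        return None


# Two sample rows, one frequent value, and one hypothetical bad entry.
samples = ["31/12/22", "29/12/22", "12/05/25", "#VALUE!"]
parsed = [parse_day_first(s) for s in samples]
for raw, value in zip(samples, parsed):
    print(raw, "->", value)
```

Making the day-first choice explicit matters for ambiguous values such as 12/05/25, which a month-first parser would read as 5 December rather than 12 May.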
Value                 Count     Frequency (%)
01/04/24              527       0.4%
10/06/24              441       0.4%
18/03/24              419       0.3%
04/03/24              414       0.3%
16/09/24              411       0.3%
22/04/24              408       0.3%
30/09/24              400       0.3%
20/05/24              393       0.3%
12/05/25              392       0.3%
29/01/24              390       0.3%
Other values (866)    116895    96.5%

Most occurring characters

ValueCountFrequency (%)
/ 242178
25.0%
2 196303
20.3%
0 146967
15.2%
1 100853
10.4%
4 82556
 
8.5%
3 74631
 
7.7%
5 43153
 
4.5%
7 21591
 
2.2%
6 20771
 
2.1%
9 19937
 
2.1%
Other values (8) 19779
 
2.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 726534
75.0%
Other Punctuation 242180
 
25.0%
Uppercase Letter 5
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 196303
27.0%
0 146967
20.2%
1 100853
13.9%
4 82556
11.4%
3 74631
 
10.3%
5 43153
 
5.9%
7 21591
 
3.0%
6 20771
 
2.9%
9 19937
 
2.7%
8 19772
 
2.7%
Uppercase Letter
ValueCountFrequency (%)
V 1
20.0%
A 1
20.0%
L 1
20.0%
U 1
20.0%
E 1
20.0%
Other Punctuation
ValueCountFrequency (%)
/ 242178
> 99.9%
# 1
 
< 0.1%
! 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 968714
> 99.9%
Latin 5
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
/ 242178
25.0%
2 196303
20.3%
0 146967
15.2%
1 100853
10.4%
4 82556
 
8.5%
3 74631
 
7.7%
5 43153
 
4.5%
7 21591
 
2.2%
6 20771
 
2.1%
9 19937
 
2.1%
Other values (3) 19774
 
2.0%
Latin
ValueCountFrequency (%)
V 1
20.0%
A 1
20.0%
L 1
20.0%
U 1
20.0%
E 1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 968719
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 242178
25.0%
2 196303
20.3%
0 146967
15.2%
1 100853
10.4%
4 82556
 
8.5%
3 74631
 
7.7%
5 43153
 
4.5%
7 21591
 
2.2%
6 20771
 
2.1%
9 19937
 
2.1%
Other values (8) 19779
 
2.0%
Distinct: 884
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 946.1 KiB
Minimum: 2023-01-01 00:00:00
Maximum: 2025-12-05 00:00:00
Invalid dates: 0
Invalid dates (%): 0.0%
Histogram with fixed size bins (bins=50)
Distinct: 741
Distinct (%): 0.6%
Missing: 9
Missing (%): < 0.1%
Memory size: 946.1 KiB
Minimum: 2022-12-26 00:00:00
Maximum: 2025-12-05 00:00:00
Invalid dates: 0
Invalid dates (%): 0.0%
Histogram with fixed size bins (bins=50)

Last Update User
Categorical

High correlation 

Distinct: 30
Distinct (%): < 0.1%
Missing: 9
Missing (%): < 0.1%
Memory size: 946.1 KiB

Косарынская Людмила: 31891
Дорофеева Елена: 15474
Солодова Елена: 15351
Новикова Дарья: 12084
Миропольский Артем: 9961
Other values (25): 36320

Length

Max length31
Median length28
Mean length16.219481
Min length5

Characters and Unicode

Total characters1963871
Distinct characters52
Distinct categories3 ?
Distinct scripts3 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowСкопенко Анна Алексеевна
2nd rowНовикова Дарья
3rd rowКосарынская Людмила
4th rowКудрявцева Надежда
5th rowКудрявцева Надежда

Common Values

ValueCountFrequency (%)
Косарынская Людмила 31891
26.3%
Дорофеева Елена 15474
12.8%
Солодова Елена 15351
12.7%
Новикова Дарья 12084
 
10.0%
Миропольский Артем 9961
 
8.2%
Чурилов Юрий 9525
 
7.9%
Кудрявцева Надежда 6407
 
5.3%
webex 6066
 
5.0%
Решимова Екатерина 2980
 
2.5%
Самойлова Екатерина 2748
 
2.3%
Other values (20) 8594
 
7.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
косарынская 31891
13.2%
людмила 31891
13.2%
елена 31016
12.8%
дорофеева 15474
 
6.4%
солодова 15351
 
6.3%
новикова 12084
 
5.0%
дарья 12084
 
5.0%
артем 9961
 
4.1%
миропольский 9961
 
4.1%
чурилов 9525
 
3.9%
Other values (56) 63200
26.1%

Most occurring characters

ValueCountFrequency (%)
а 234178
 
11.9%
о 186870
 
9.5%
121357
 
6.2%
р 118799
 
6.0%
е 111840
 
5.7%
л 109677
 
5.6%
и 106242
 
5.4%
в 99423
 
5.1%
н 82119
 
4.2%
с 80985
 
4.1%
Other values (42) 712381
36.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1606142
81.8%
Uppercase Letter 236372
 
12.0%
Space Separator 121357
 
6.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
а 234178
14.6%
о 186870
11.6%
р 118799
 
7.4%
е 111840
 
7.0%
л 109677
 
6.8%
и 106242
 
6.6%
в 99423
 
6.2%
н 82119
 
5.1%
с 80985
 
5.0%
д 70577
 
4.4%
Other values (23) 405432
25.2%
Uppercase Letter
ValueCountFrequency (%)
К 38775
16.4%
Е 36778
15.6%
Л 32166
13.6%
Д 29643
12.5%
С 20475
8.7%
Н 18682
7.9%
А 16048
6.8%
М 10992
 
4.7%
Ч 10556
 
4.5%
Ю 10526
 
4.5%
Other values (8) 11731
 
5.0%
Space Separator
ValueCountFrequency (%)
121357
100.0%

Most occurring scripts

ValueCountFrequency (%)
Cyrillic 1812184
92.3%
Common 121357
 
6.2%
Latin 30330
 
1.5%

Most frequent character per script

Cyrillic
ValueCountFrequency (%)
а 234178
 
12.9%
о 186870
 
10.3%
р 118799
 
6.6%
е 111840
 
6.2%
л 109677
 
6.1%
и 106242
 
5.9%
в 99423
 
5.5%
н 82119
 
4.5%
с 80985
 
4.5%
д 70577
 
3.9%
Other values (37) 611474
33.7%
Latin
ValueCountFrequency (%)
e 12132
40.0%
b 6066
20.0%
x 6066
20.0%
w 6066
20.0%
Common
ValueCountFrequency (%)
121357
100.0%

Most occurring blocks

ValueCountFrequency (%)
Cyrillic 1812184
92.3%
ASCII 151687
 
7.7%

Most frequent character per block

Cyrillic
ValueCountFrequency (%)
а 234178
 
12.9%
о 186870
 
10.3%
р 118799
 
6.6%
е 111840
 
6.2%
л 109677
 
6.1%
и 106242
 
5.9%
в 99423
 
5.5%
н 82119
 
4.5%
с 80985
 
4.5%
д 70577
 
3.9%
Other values (37) 611474
33.7%
ASCII
ValueCountFrequency (%)
121357
80.0%
e 12132
 
8.0%
b 6066
 
4.0%
x 6066
 
4.0%
w 6066
 
4.0%
Distinct: 706
Distinct (%): 0.6%
Missing: 9
Missing (%): < 0.1%
Memory size: 946.1 KiB
Minimum: 2022-12-14 00:00:00
Maximum: 2025-12-05 00:00:00
Invalid dates: 0
Invalid dates (%): 0.0%
Histogram with fixed size bins (bins=50)

Автор
Categorical

High correlation 

Distinct: 18
Distinct (%): < 0.1%
Missing: 9
Missing (%): < 0.1%
Memory size: 946.1 KiB

Косарынская Людмила: 30543
Дорофеева Елена: 22500
Миропольский Артем: 15468
Новикова Дарья: 12594
Солодова Елена: 10847
Other values (13): 29129

Length

Max length28
Median length25
Mean length16.784268
Min length12

Characters and Unicode

Total characters2032256
Distinct characters46
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st rowСкопенко Анна Алексеевна
2nd rowНовикова Дарья
3rd rowКосарынская Людмила
4th rowКосарынская Людмила
5th rowКосарынская Людмила

Common Values

ValueCountFrequency (%)
Косарынская Людмила 30543
25.2%
Дорофеева Елена 22500
18.6%
Миропольский Артем 15468
12.8%
Новикова Дарья 12594
10.4%
Солодова Елена 10847
 
9.0%
Чурилов Юрий 9470
 
7.8%
Решимова Екатерина 5119
 
4.2%
Стенягина Елена Николаевна 4566
 
3.8%
Кудрявцева Надежда 3722
 
3.1%
Самойлова Екатерина 3631
 
3.0%
Other values (8) 2621
 
2.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
елена 37913
15.3%
косарынская 30543
12.4%
людмила 30543
12.4%
дорофеева 22500
9.1%
артем 15470
 
6.3%
миропольский 15468
 
6.3%
новикова 12594
 
5.1%
дарья 12594
 
5.1%
солодова 10847
 
4.4%
юрий 9470
 
3.8%
Other values (27) 49226
19.9%

Most occurring characters

ValueCountFrequency (%)
а 250238
 
12.3%
о 195846
 
9.6%
е 130596
 
6.4%
р 127998
 
6.3%
126087
 
6.2%
и 118188
 
5.8%
л 113148
 
5.6%
н 95083
 
4.7%
в 91403
 
4.5%
с 80819
 
4.0%
Other values (36) 702850
34.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1659001
81.6%
Uppercase Letter 247168
 
12.2%
Space Separator 126087
 
6.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
а 250238
15.1%
о 195846
11.8%
е 130596
 
7.9%
р 127998
 
7.7%
и 118188
 
7.1%
л 113148
 
6.8%
н 95083
 
5.7%
в 91403
 
5.5%
с 80819
 
4.9%
к 75148
 
4.5%
Other values (17) 380534
22.9%
Uppercase Letter
ValueCountFrequency (%)
Е 46663
18.9%
Д 35094
14.2%
К 34265
13.9%
Л 30543
12.4%
Н 20882
8.4%
С 19483
7.9%
А 18275
 
7.4%
М 15468
 
6.3%
Ч 9470
 
3.8%
Ю 9470
 
3.8%
Other values (8) 7555
 
3.1%
Space Separator
ValueCountFrequency (%)
126087
100.0%

Most occurring scripts

ValueCountFrequency (%)
Cyrillic 1906169
93.8%
Common 126087
 
6.2%

Most frequent character per script

Cyrillic
ValueCountFrequency (%)
а 250238
 
13.1%
о 195846
 
10.3%
е 130596
 
6.9%
р 127998
 
6.7%
и 118188
 
6.2%
л 113148
 
5.9%
н 95083
 
5.0%
в 91403
 
4.8%
с 80819
 
4.2%
к 75148
 
3.9%
Other values (35) 627702
32.9%
Common
ValueCountFrequency (%)
126087
100.0%

Most occurring blocks

ValueCountFrequency (%)
Cyrillic 1906169
93.8%
ASCII 126087
 
6.2%

Most frequent character per block

Cyrillic
ValueCountFrequency (%)
а 250238
 
13.1%
о 195846
 
10.3%
е 130596
 
6.9%
р 127998
 
6.7%
и 118188
 
6.2%
л 113148
 
5.9%
н 95083
 
5.0%
в 91403
 
4.8%
с 80819
 
4.2%
к 75148
 
3.9%
Other values (35) 627702
32.9%
ASCII
ValueCountFrequency (%)
126087
100.0%

Статья ДДС
Categorical

High correlation 

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 946.1 KiB

ТРЕЙДИНГ ДМ: 23927
БАНКИ: 21277
ИНКАССАЦИЯ: 20710
ЮВЕЛИРЫ: 20260
ЛОМБАРДЫ: 18514
Other values (7): 16402

Length

Max length24
Median length16
Mean length8.8035511
Min length5

Characters and Unicode

Total characters1066022
Distinct characters31
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowПРОЧЕЕ
2nd rowТРЕЙДИНГ ДМ
3rd rowБАНКИ
4th rowАВТОКАТАЛИЗАТОРЫ
5th rowАВТОКАТАЛИЗАТОРЫ

Common Values

ValueCountFrequency (%)
ТРЕЙДИНГ ДМ 23927
19.8%
БАНКИ 21277
17.6%
ИНКАССАЦИЯ 20710
17.1%
ЮВЕЛИРЫ 20260
16.7%
ЛОМБАРДЫ 18514
15.3%
АФФ.ЗАВОДЫ 8326
 
6.9%
ПРОМЫШЛЕННОСТЬ 2327
 
1.9%
НЕДРОПОЛЬЗОВАТЕЛИ 1818
 
1.5%
АВТОКАТАЛИЗАТОРЫ 1402
 
1.2%
ПРОЧЕЕ 1232
 
1.0%
Other values (2) 1297
 
1.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
трейдинг 23927
16.2%
дм 23927
16.2%
банки 21277
14.4%
инкассация 20710
14.0%
ювелиры 20260
13.7%
ломбарды 18514
12.5%
афф.заводы 8326
 
5.6%
промышленность 2327
 
1.6%
недропользователи 1818
 
1.2%
автокатализаторы 1402
 
0.9%
Other values (7) 5645
 
3.8%

Most occurring characters

ValueCountFrequency (%)
И 112959
 
10.6%
А 107365
 
10.1%
Д 76512
 
7.2%
Н 74980
 
7.0%
Р 70777
 
6.6%
Е 53650
 
5.0%
Ы 51608
 
4.8%
Л 46918
 
4.4%
М 44768
 
4.2%
К 44168
 
4.1%
Other values (21) 382317
35.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1029095
96.5%
Space Separator 27043
 
2.5%
Other Punctuation 8326
 
0.8%
Open Punctuation 779
 
0.1%
Close Punctuation 779
 
0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
И 112959
 
11.0%
А 107365
 
10.4%
Д 76512
 
7.4%
Н 74980
 
7.3%
Р 70777
 
6.9%
Е 53650
 
5.2%
Ы 51608
 
5.0%
Л 46918
 
4.6%
М 44768
 
4.4%
К 44168
 
4.3%
Other values (17) 345390
33.6%
Space Separator
ValueCountFrequency (%)
27043
100.0%
Other Punctuation
ValueCountFrequency (%)
. 8326
100.0%
Open Punctuation
ValueCountFrequency (%)
( 779
100.0%
Close Punctuation
ValueCountFrequency (%)
) 779
100.0%

Most occurring scripts

ValueCountFrequency (%)
Cyrillic 1029095
96.5%
Common 36927
 
3.5%

Most frequent character per script

Cyrillic
ValueCountFrequency (%)
И 112959
 
11.0%
А 107365
 
10.4%
Д 76512
 
7.4%
Н 74980
 
7.3%
Р 70777
 
6.9%
Е 53650
 
5.2%
Ы 51608
 
5.0%
Л 46918
 
4.6%
М 44768
 
4.4%
К 44168
 
4.3%
Other values (17) 345390
33.6%
Common
ValueCountFrequency (%)
27043
73.2%
. 8326
 
22.5%
( 779
 
2.1%
) 779
 
2.1%

Most occurring blocks

ValueCountFrequency (%)
Cyrillic 1029095
96.5%
ASCII 36927
 
3.5%

Most frequent character per block

Cyrillic
ValueCountFrequency (%)
И 112959
 
11.0%
А 107365
 
10.4%
Д 76512
 
7.4%
Н 74980
 
7.3%
Р 70777
 
6.9%
Е 53650
 
5.2%
Ы 51608
 
5.0%
Л 46918
 
4.6%
М 44768
 
4.4%
К 44168
 
4.3%
Other values (17) 345390
33.6%
ASCII
ValueCountFrequency (%)
27043
73.2%
. 8326
 
22.5%
( 779
 
2.1%
) 779
 
2.1%
Distinct: 658
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 946.1 KiB

Length

Max length77
Median length70
Mean length13.705467
Min length3

Characters and Unicode

Total characters1659595
Distinct characters86
Distinct categories10 ?
Distinct scripts2 ?
Distinct blocks4 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique119 ?
Unique (%)0.1%

Sample

1st rowИрАэро Авиакомпания АО
2nd rowТехномет
3rd rowРОСБАНК ПАО
4th rowЕврокат Волга ООО
5th rowЕвромет Север ООО
ValueCountFrequency (%)
пао 32394
 
11.7%
ооо 26707
 
9.7%
росбанк 23845
 
8.6%
ломбард 18511
 
6.7%
мюз 10790
 
3.9%
сольфер 9398
 
3.4%
9170
 
3.3%
оао 8184
 
3.0%
красцветмет 8159
 
3.0%
авто 7418
 
2.7%
Other values (902) 121545
44.0%

Most occurring characters

ValueCountFrequency (%)
О 198385
 
12.0%
155038
 
9.3%
А 111502
 
6.7%
а 74261
 
4.5%
о 63388
 
3.8%
К 57401
 
3.5%
С 56538
 
3.4%
Р 55397
 
3.3%
е 52635
 
3.2%
т 50984
 
3.1%
Other values (76) 784066
47.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 839737
50.6%
Lowercase Letter 642108
38.7%
Space Separator 155038
 
9.3%
Dash Punctuation 14897
 
0.9%
Close Punctuation 2585
 
0.2%
Open Punctuation 2585
 
0.2%
Other Punctuation 1620
 
0.1%
Decimal Number 495
 
< 0.1%
Initial Punctuation 265
 
< 0.1%
Final Punctuation 265
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
О 198385
23.6%
А 111502
13.3%
К 57401
 
6.8%
С 56538
 
6.7%
Р 55397
 
6.6%
П 49287
 
5.9%
Л 48189
 
5.7%
Б 44349
 
5.3%
Н 38535
 
4.6%
Т 23392
 
2.8%
Other values (23) 156762
18.7%
Lowercase Letter
ValueCountFrequency (%)
а 74261
11.6%
о 63388
 
9.9%
е 52635
 
8.2%
т 50984
 
7.9%
р 47317
 
7.4%
м 43060
 
6.7%
в 38767
 
6.0%
и 35815
 
5.6%
н 32356
 
5.0%
л 32225
 
5.0%
Other values (23) 171300
26.7%
Decimal Number
ValueCountFrequency (%)
1 127
25.7%
7 114
23.0%
9 90
18.2%
2 70
14.1%
4 40
 
8.1%
6 31
 
6.3%
5 8
 
1.6%
0 8
 
1.6%
8 6
 
1.2%
3 1
 
0.2%
Other Punctuation
ValueCountFrequency (%)
. 1142
70.5%
, 446
 
27.5%
" 32
 
2.0%
Dash Punctuation
ValueCountFrequency (%)
- 14822
99.5%
75
 
0.5%
Space Separator
ValueCountFrequency (%)
155038
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2585
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2585
100.0%
Initial Punctuation
ValueCountFrequency (%)
« 265
100.0%
Final Punctuation
ValueCountFrequency (%)
» 265
100.0%

Most occurring scripts

ValueCountFrequency (%)
Cyrillic 1481845
89.3%
Common 177750
 
10.7%

Most frequent character per script

Cyrillic
ValueCountFrequency (%)
О 198385
 
13.4%
А 111502
 
7.5%
а 74261
 
5.0%
о 63388
 
4.3%
К 57401
 
3.9%
С 56538
 
3.8%
Р 55397
 
3.7%
е 52635
 
3.6%
т 50984
 
3.4%
П 49287
 
3.3%
Other values (56) 712067
48.1%
Common
ValueCountFrequency (%)
155038
87.2%
- 14822
 
8.3%
) 2585
 
1.5%
( 2585
 
1.5%
. 1142
 
0.6%
, 446
 
0.3%
« 265
 
0.1%
» 265
 
0.1%
1 127
 
0.1%
7 114
 
0.1%
Other values (10) 361
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
Cyrillic 1481845
89.3%
ASCII 177145
 
10.7%
None 530
 
< 0.1%
Punctuation 75
 
< 0.1%

Most frequent character per block

Cyrillic
ValueCountFrequency (%)
О 198385
 
13.4%
А 111502
 
7.5%
а 74261
 
5.0%
о 63388
 
4.3%
К 57401
 
3.9%
С 56538
 
3.8%
Р 55397
 
3.7%
е 52635
 
3.6%
т 50984
 
3.4%
П 49287
 
3.3%
Other values (56) 712067
48.1%
ASCII
ValueCountFrequency (%)
155038
87.5%
- 14822
 
8.4%
) 2585
 
1.5%
( 2585
 
1.5%
. 1142
 
0.6%
, 446
 
0.3%
1 127
 
0.1%
7 114
 
0.1%
9 90
 
0.1%
2 70
 
< 0.1%
Other values (7) 126
 
0.1%
None
ValueCountFrequency (%)
« 265
50.0%
» 265
50.0%
Punctuation
ValueCountFrequency (%)
75
100.0%
Distinct: 260
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 946.1 KiB

Length

Max length24
Median length6
Mean length7.1420018
Min length2

Characters and Unicode

Total characters864825
Distinct characters65
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique50 ?
Unique (%)< 0.1%

Sample

1st rowМосква
2nd rowМосква
3rd rowМосква
4th rowНижний Новгород
5th rowСанкт-Петербург
ValueCountFrequency (%)
москва 83338
67.4%
красноярск 11168
 
9.0%
санкт-петербург 3444
 
2.8%
касимов 2255
 
1.8%
кострома 1528
 
1.2%
екатеринбург 1286
 
1.0%
новосибирск 1194
 
1.0%
бишкек 1087
 
0.9%
магадан 989
 
0.8%
верхняя 867
 
0.7%
Other values (269) 16471
 
13.3%

Most occurring characters

ValueCountFrequency (%)
с 117104
13.5%
о 117048
13.5%
а 116962
13.5%
к 108070
12.5%
в 91741
10.6%
М 84934
9.8%
р 42798
 
4.9%
н 25688
 
3.0%
К 16768
 
1.9%
е 16358
 
1.9%
Other values (55) 127354
14.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 728481
84.2%
Uppercase Letter 127888
 
14.8%
Dash Punctuation 5424
 
0.6%
Space Separator 2537
 
0.3%
Other Punctuation 465
 
0.1%
Decimal Number 30
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
с 117104
16.1%
о 117048
16.1%
а 116962
16.1%
к 108070
14.8%
в 91741
12.6%
р 42798
 
5.9%
н 25688
 
3.5%
е 16358
 
2.2%
я 13872
 
1.9%
т 13434
 
1.8%
Other values (22) 65406
9.0%
Uppercase Letter
ValueCountFrequency (%)
М 84934
66.4%
К 16768
 
13.1%
П 5007
 
3.9%
С 4991
 
3.9%
В 2875
 
2.2%
Н 2818
 
2.2%
Е 1832
 
1.4%
Б 1270
 
1.0%
Д 1157
 
0.9%
А 1121
 
0.9%
Other values (18) 5115
 
4.0%
Other Punctuation
ValueCountFrequency (%)
. 451
97.0%
, 14
 
3.0%
Dash Punctuation
ValueCountFrequency (%)
- 5424
100.0%
Space Separator
ValueCountFrequency (%)
2537
100.0%
Decimal Number
ValueCountFrequency (%)
2 30
100.0%

Most occurring scripts

ValueCountFrequency (%)
Cyrillic 856369
99.0%
Common 8456
 
1.0%

Most frequent character per script

Cyrillic
ValueCountFrequency (%)
с 117104
13.7%
о 117048
13.7%
а 116962
13.7%
к 108070
12.6%
в 91741
10.7%
М 84934
9.9%
р 42798
 
5.0%
н 25688
 
3.0%
К 16768
 
2.0%
е 16358
 
1.9%
Other values (50) 118898
13.9%
Common
ValueCountFrequency (%)
- 5424
64.1%
2537
30.0%
. 451
 
5.3%
2 30
 
0.4%
, 14
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
Cyrillic 856369
99.0%
ASCII 8456
 
1.0%

Most frequent character per block

Cyrillic
ValueCountFrequency (%)
с 117104
13.7%
о 117048
13.7%
а 116962
13.7%
к 108070
12.6%
в 91741
10.7%
М 84934
9.9%
р 42798
 
5.0%
н 25688
 
3.0%
К 16768
 
2.0%
е 16358
 
1.9%
Other values (50) 118898
13.9%
ASCII
ValueCountFrequency (%)
- 5424
64.1%
2537
30.0%
. 451
 
5.3%
2 30
 
0.4%
, 14
 
0.2%
Distinct: 320
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 946.1 KiB

Length

Max length24
Median length6
Mean length7.4262532
Min length3

Characters and Unicode

Total characters899245
Distinct characters65
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique55 ?
Unique (%)< 0.1%

Sample

1st rowИркутск
2nd rowКаменск-Уральский
3rd rowМосква
4th rowКрасноярск
5th rowКрасноярск
ValueCountFrequency (%)
москва 76438
62.0%
красноярск 8936
 
7.3%
кострома 5580
 
4.5%
санкт-петербург 4793
 
3.9%
екатеринбург 1694
 
1.4%
красное-на-волге 1689
 
1.4%
касимов 1572
 
1.3%
новосибирск 1081
 
0.9%
новгород 762
 
0.6%
казань 691
 
0.6%
Other values (330) 19987
 
16.2%

Most occurring characters

ValueCountFrequency (%)
о 123549
13.7%
а 116570
13.0%
с 112009
12.5%
к 100494
11.2%
в 86200
9.6%
М 76995
8.6%
р 50409
 
5.6%
н 28941
 
3.2%
е 22681
 
2.5%
т 21575
 
2.4%
Other values (55) 159822
17.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 756536
84.1%
Uppercase Letter 130332
 
14.5%
Dash Punctuation 9783
 
1.1%
Space Separator 2133
 
0.2%
Other Punctuation 433
 
< 0.1%
Decimal Number 28
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
о 123549
16.3%
а 116570
15.4%
с 112009
14.8%
к 100494
13.3%
в 86200
11.4%
р 50409
6.7%
н 28941
 
3.8%
е 22681
 
3.0%
т 21575
 
2.9%
и 13079
 
1.7%
Other values (22) 81029
10.7%
Uppercase Letter
ValueCountFrequency (%)
М 76995
59.1%
К 20238
 
15.5%
С 7180
 
5.5%
П 6511
 
5.0%
В 3982
 
3.1%
Н 3134
 
2.4%
Е 2362
 
1.8%
Т 1248
 
1.0%
Д 1235
 
0.9%
Р 1043
 
0.8%
Other values (18) 6404
 
4.9%
Other Punctuation
ValueCountFrequency (%)
. 420
97.0%
, 13
 
3.0%
Dash Punctuation
ValueCountFrequency (%)
- 9783
100.0%
Space Separator
ValueCountFrequency (%)
2133
100.0%
Decimal Number
ValueCountFrequency (%)
2 28
100.0%

Most occurring scripts

ValueCountFrequency (%)
Cyrillic 886868
98.6%
Common 12377
 
1.4%

Most frequent character per script

Cyrillic
ValueCountFrequency (%)
о 123549
13.9%
а 116570
13.1%
с 112009
12.6%
к 100494
11.3%
в 86200
9.7%
М 76995
8.7%
р 50409
 
5.7%
н 28941
 
3.3%
е 22681
 
2.6%
т 21575
 
2.4%
Other values (50) 147445
16.6%
Common
ValueCountFrequency (%)
- 9783
79.0%
2133
 
17.2%
. 420
 
3.4%
2 28
 
0.2%
, 13
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
Cyrillic 886868
98.6%
ASCII 12377
 
1.4%

Most frequent character per block

Cyrillic
ValueCountFrequency (%)
о 123549
13.9%
а 116570
13.1%
с 112009
12.6%
к 100494
11.3%
в 86200
9.7%
М 76995
8.7%
р 50409
 
5.7%
н 28941
 
3.3%
е 22681
 
2.6%
т 21575
 
2.4%
Other values (50) 147445
16.6%
ASCII
ValueCountFrequency (%)
- 9783
79.0%
2133
 
17.2%
. 420
 
3.4%
2 28
 
0.2%
, 13
 
0.1%

Mode Of Transport Sh
Categorical

High correlation 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 946.1 KiB

Road: 75704
Air: 44518
Rail: 863
SubTr: 5

Length

Max length: 5
Median length: 4
Mean length: 3.6323974
Min length: 3

Characters and Unicode

Total characters: 439847
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Air
2nd row: Air
3rd row: Road
4th row: Rail
5th row: Rail

Common Values

Value    Count    Frequency (%)
Road     75704    62.5%
Air      44518    36.8%
Rail     863      0.7%
SubTr    5        < 0.1%
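
The frequency column is each count divided by the total number of observations. A short sketch that recomputes it from the counts in the table above (variable names are illustrative):

```python
# Counts copied from the "Common Values" table for Mode Of Transport Sh.
counts = {"Road": 75704, "Air": 44518, "Rail": 863, "SubTr": 5}

total = sum(counts.values())
shares = {mode: 100 * n / total for mode, n in counts.items()}

for mode, share in shares.items():
    print(f"{mode}: {share:.1f}%")
```

The counts add up to the 121090 observations of the dataset, which is consistent with this variable having no missing values.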

Length

Histogram of lengths of the category

ValueCountFrequency (%)
road 75704
62.5%
air 44518
36.8%
rail 863
 
0.7%
subtr 5
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
R 76567
17.4%
a 76567
17.4%
o 75704
17.2%
d 75704
17.2%
i 45381
10.3%
r 44523
10.1%
A 44518
10.1%
l 863
 
0.2%
S 5
 
< 0.1%
u 5
 
< 0.1%
Other values (2) 10
 
< 0.1%

Most occurring categories

Value               Count     Frequency (%)
Lowercase Letter    318752    72.5%
Uppercase Letter    121095    27.5%
Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 76567
24.0%
o 75704
23.8%
d 75704
23.8%
i 45381
14.2%
r 44523
14.0%
l 863
 
0.3%
u 5
 
< 0.1%
b 5
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
R 76567
63.2%
A 44518
36.8%
S 5
 
< 0.1%
T 5
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 439847
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 76567
17.4%
a 76567
17.4%
o 75704
17.2%
d 75704
17.2%
i 45381
10.3%
r 44523
10.1%
A 44518
10.1%
l 863
 
0.2%
S 5
 
< 0.1%
u 5
 
< 0.1%
Other values (2) 10
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 439847
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R 76567
17.4%
a 76567
17.4%
o 75704
17.2%
d 75704
17.2%
i 45381
10.3%
r 44523
10.1%
A 44518
10.1%
l 863
 
0.2%
S 5
 
< 0.1%
u 5
 
< 0.1%
Other values (2) 10
 
< 0.1%

DepartmentCodeDescription
Categorical

High correlation 

Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 946.1 KiB

Banknotes: 39807
DJ: 31837
Prec.Metals: 23245
Gold: 14345
Silver: 5313
Other values (10): 6543

Length

Max length11
Median length9
Mean length6.6545132
Min length2

Characters and Unicode

Total characters805795
Distinct characters32
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowOther
2nd rowPrec.Metals
3rd rowBanknotes
4th rowCatalyst
5th rowCatalyst

Common Values

ValueCountFrequency (%)
Banknotes 39807
32.9%
DJ 31837
26.3%
Prec.Metals 23245
19.2%
Gold 14345
 
11.8%
Silver 5313
 
4.4%
ATM 1824
 
1.5%
Diamonds 1472
 
1.2%
Catalyst 1250
 
1.0%
Other 1148
 
0.9%
Jewellery 195
 
0.2%
Other values (5) 654
 
0.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
banknotes 39807
32.9%
dj 31837
26.3%
prec.metals 23245
19.2%
gold 14345
 
11.8%
silver 5313
 
4.4%
atm 1824
 
1.5%
diamonds 1472
 
1.2%
catalyst 1250
 
1.0%
other 1148
 
0.9%
jewellery 195
 
0.2%
Other values (6) 686
 
0.6%

Most occurring characters

ValueCountFrequency (%)
e 93565
11.6%
n 81399
 
10.1%
a 67618
 
8.4%
t 66855
 
8.3%
s 66087
 
8.2%
o 55937
 
6.9%
l 44543
 
5.5%
B 39807
 
4.9%
k 39807
 
4.9%
D 33499
 
4.2%
Other values (22) 216678
26.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 602666
74.8%
Uppercase Letter 179852
 
22.3%
Other Punctuation 23245
 
2.9%
Space Separator 32
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 93565
15.5%
n 81399
13.5%
a 67618
11.2%
t 66855
11.1%
s 66087
11.0%
o 55937
9.3%
l 44543
7.4%
k 39807
6.6%
r 30464
 
5.1%
c 23245
 
3.9%
Other values (9) 33146
 
5.5%
Uppercase Letter
ValueCountFrequency (%)
B 39807
22.1%
D 33499
18.6%
J 32032
17.8%
M 25069
13.9%
P 23431
13.0%
G 14345
 
8.0%
S 5313
 
3.0%
A 1947
 
1.1%
T 1824
 
1.0%
C 1437
 
0.8%
Other Punctuation
ValueCountFrequency (%)
. 23245
100.0%
Space Separator
ValueCountFrequency (%)
32
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 782518
97.1%
Common 23277
 
2.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 93565
12.0%
n 81399
10.4%
a 67618
 
8.6%
t 66855
 
8.5%
s 66087
 
8.4%
o 55937
 
7.1%
l 44543
 
5.7%
B 39807
 
5.1%
k 39807
 
5.1%
D 33499
 
4.3%
Other values (20) 193401
24.7%
Common
ValueCountFrequency (%)
. 23245
99.9%
32
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 805795
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 93565
11.6%
n 81399
 
10.1%
a 67618
 
8.4%
t 66855
 
8.3%
s 66087
 
8.2%
o 55937
 
6.9%
l 44543
 
5.5%
B 39807
 
4.9%
k 39807
 
4.9%
D 33499
 
4.2%
Other values (22) 216678
26.9%
Distinct: 28
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 946.1 KiB

Дверь-в-Дверь: 21820
Банкоматы: 20610
Дверь в дверь ОПТИМ: 13319
Доп.адрес по Москве: 12929
Перевозка по Москве: 12883
Other values (23): 39529

Length

Max length25
Median length24
Mean length14.764142
Min length3

Characters and Unicode

Total characters1787790
Distinct characters51
Distinct categories8
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
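As a minimal sketch of how such character properties can be tallied, the snippet below counts Unicode general categories with Python's standard `unicodedata` module. It is an illustration only, not the profiler's own implementation: the helper name `category_counts` is hypothetical, and the three sample strings are taken from the common values of this variable. The two-letter codes correspond to the category names used in this report (`Ll` = Lowercase Letter, `Lu` = Uppercase Letter, `Zs` = Space Separator, `Pd` = Dash Punctuation, `Po` = Other Punctuation). Scripts and blocks are not exposed by `unicodedata`, so they are left out.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`.

    Hypothetical helper for illustration; not part of any profiling library.
    """
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts


# Three values copied from the "Common Values" table of this variable.
sample = ["Дверь-в-Дверь", "Банкоматы", "Доп.адрес по Москве"]
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```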

Unique

Unique0
Unique (%)0.0%

Sample

1st rowДверь-в-Дверь
2nd rowДверь в дверь ЭКСПРЕСС
3rd rowБанкоматы
4th rowДверь-в-Дверь
5th rowДверь-в-Дверь

Common Values

ValueCountFrequency (%)
Дверь-в-Дверь 21820
18.0%
Банкоматы 20610
17.0%
Дверь в дверь ОПТИМ 13319
11.0%
Доп.адрес по Москве 12929
10.7%
Перевозка по Москве 12883
10.6%
Доп адрес 6962
 
5.7%
банкоматы срочные 5628
 
4.6%
Встречный груз 4955
 
4.1%
Попутная доставка 3896
 
3.2%
Дверь в дверь ЭКСПРЕСС 3546
 
2.9%
Other values (18) 14542
12.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
дверь 33920
13.2%
по 26659
10.3%
банкоматы 26238
10.2%
москве 25812
10.0%
дверь-в-дверь 21820
 
8.5%
в 16960
 
6.6%
оптим 14005
 
5.4%
перевозка 13730
 
5.3%
доп.адрес 12929
 
5.0%
доп 6962
 
2.7%
Other values (30) 58858
22.8%

Most occurring characters

ValueCountFrequency (%)
е 178824
 
10.0%
в 169828
 
9.5%
р 144853
 
8.1%
136803
 
7.7%
о 134858
 
7.5%
а 124294
 
7.0%
Д 81868
 
4.6%
ь 78937
 
4.4%
к 72176
 
4.0%
с 62591
 
3.5%
Other values (41) 602758
33.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1329498
74.4%
Uppercase Letter 263935
 
14.8%
Space Separator 136803
 
7.7%
Dash Punctuation 44331
 
2.5%
Other Punctuation 12929
 
0.7%
Open Punctuation 98
 
< 0.1%
Close Punctuation 98
 
< 0.1%
Decimal Number 98
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
е 178824
13.5%
в 169828
12.8%
р 144853
10.9%
о 134858
10.1%
а 124294
9.3%
ь 78937
 
5.9%
к 72176
 
5.4%
с 62591
 
4.7%
п 52775
 
4.0%
н 48159
 
3.6%
Other values (16) 262203
19.7%
Uppercase Letter
ValueCountFrequency (%)
Д 81868
31.0%
П 40297
15.3%
М 39131
14.8%
Б 20610
 
7.8%
О 14048
 
5.3%
Т 13340
 
5.1%
И 13319
 
5.0%
С 10644
 
4.0%
Р 6073
 
2.3%
К 5387
 
2.0%
Other values (8) 19218
 
7.3%
Decimal Number
ValueCountFrequency (%)
3 80
81.6%
2 18
 
18.4%
Space Separator
ValueCountFrequency (%)
136803
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 44331
100.0%
Other Punctuation
ValueCountFrequency (%)
. 12929
100.0%
Open Punctuation
ValueCountFrequency (%)
( 98
100.0%
Close Punctuation
ValueCountFrequency (%)
) 98
100.0%

Most occurring scripts

ValueCountFrequency (%)
Cyrillic 1593433
89.1%
Common 194357
 
10.9%

Most frequent character per script

Cyrillic
ValueCountFrequency (%)
е 178824
 
11.2%
в 169828
 
10.7%
р 144853
 
9.1%
о 134858
 
8.5%
а 124294
 
7.8%
Д 81868
 
5.1%
ь 78937
 
5.0%
к 72176
 
4.5%
с 62591
 
3.9%
п 52775
 
3.3%
Other values (34) 492429
30.9%
Common
ValueCountFrequency (%)
136803
70.4%
- 44331
 
22.8%
. 12929
 
6.7%
( 98
 
0.1%
) 98
 
0.1%
3 80
 
< 0.1%
2 18
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
Cyrillic 1593433
89.1%
ASCII 194357
 
10.9%

Most frequent character per block

Cyrillic
ValueCountFrequency (%)
е 178824
 
11.2%
в 169828
 
10.7%
р 144853
 
9.1%
о 134858
 
8.5%
а 124294
 
7.8%
Д 81868
 
5.1%
ь 78937
 
5.0%
к 72176
 
4.5%
с 62591
 
3.9%
п 52775
 
3.3%
Other values (34) 492429
30.9%
ASCII
ValueCountFrequency (%)
136803
70.4%
- 44331
 
22.8%
. 12929
 
6.7%
( 98
 
0.1%
) 98
 
0.1%
3 80
 
< 0.1%
2 18
 
< 0.1%

Gross Weight
Real number (ℝ)

High correlation  Skewed 

Distinct8770
Distinct (%)7.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean79.373295
Minimum0.001
Maximum71012
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size946.1 KiB

Quantile statistics

Minimum0.001
5-th percentile0.89
Q11
median1
Q311
95-th percentile212.955
Maximum71012
Range71011.999
Interquartile range (IQR)10

Descriptive statistics

Standard deviation816.34431
Coefficient of variation (CV)10.284874
Kurtosis2180.5224
Mean79.373295
Median Absolute Deviation (MAD)0.4
Skewness38.279612
Sum9611312.3
Variance666418.03
MonotonicityNot monotonic
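Several of the figures above are derived from others, so they can be cross-checked with plain arithmetic. The sketch below recomputes the range, interquartile range, coefficient of variation, and variance from the reported minimum, maximum, quartiles, mean, and standard deviation of Gross Weight. The variable names are chosen for illustration, and the inputs are the rounded values printed in this report, so agreement is only expected up to rounding.

```python
import math

# Values copied from the Gross Weight summary in this report.
minimum, maximum = 0.001, 71012
q1, q3 = 1, 11
mean, std = 79.373295, 816.34431

value_range = maximum - minimum  # reported Range: 71011.999
iqr = q3 - q1                    # reported IQR: 10
cv = std / mean                  # reported CV: 10.284874
variance = std ** 2              # reported Variance: 666418.03

print(f"range={value_range:.3f} iqr={iqr} cv={cv:.6f} variance={variance:.2f}")
```

A coefficient of variation above 10, together with a median of 1 against a mean near 79, is consistent with the heavy right tail flagged by the Skewed alert.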
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 56143
46.4%
2 3049
 
2.5%
3 2498
 
2.1%
5 2173
 
1.8%
4 2057
 
1.7%
10 1522
 
1.3%
6 1440
 
1.2%
7 931
 
0.8%
8 879
 
0.7%
20 827
 
0.7%
Other values (8760) 49571
40.9%
ValueCountFrequency (%)
0.001 1
 
< 0.1%
0.004 1
 
< 0.1%
0.01 5
 
< 0.1%
0.02 5
 
< 0.1%
0.032 1
 
< 0.1%
0.038 1
 
< 0.1%
0.04 4
 
< 0.1%
0.043 1
 
< 0.1%
0.05 27
< 0.1%
0.057 1
 
< 0.1%
ValueCountFrequency (%)
71012 1
< 0.1%
69000 1
< 0.1%
68000 1
< 0.1%
50286 1
< 0.1%
45100 1
< 0.1%
44467 1
< 0.1%
42000 1
< 0.1%
41400 1
< 0.1%
41097 1
< 0.1%
40000 1
< 0.1%

Revenue (Local)
Real number (ℝ)

High correlation 

Distinct44436
Distinct (%)36.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean18845.586
Minimum0
Maximum2592753
Zeros26
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size946.1 KiB

Quantile statistics

Minimum0
5-th percentile500
Q11346.9075
median3950
Q313509.083
95-th percentile70408.881
Maximum2592753
Range2592753
Interquartile range (IQR)12162.175

Descriptive statistics

Standard deviation69664.971
Coefficient of variation (CV)3.6966201
Kurtosis248.04687
Mean18845.586
Median Absolute Deviation (MAD)3450
Skewness13.12599
Sum2.282012 × 10⁹
Variance4.8532082 × 10⁹
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
500 14334
 
11.8%
541.67 6214
 
5.1%
1750 5756
 
4.8%
1350 3773
 
3.1%
1125 3582
 
3.0%
1250 1837
 
1.5%
1450 1731
 
1.4%
1916.67 1510
 
1.2%
13200 1349
 
1.1%
2000 1013
 
0.8%
Other values (44426) 79991
66.1%
ValueCountFrequency (%)
0 26
 
< 0.1%
0.01 297
0.2%
0.14 1
 
< 0.1%
0.48 1
 
< 0.1%
0.67 1
 
< 0.1%
0.72 1
 
< 0.1%
0.74 1
 
< 0.1%
0.85 1
 
< 0.1%
0.86 1
 
< 0.1%
0.93 1
 
< 0.1%
ValueCountFrequency (%)
2592752.98 1
< 0.1%
2458000 1
< 0.1%
2195955.09 1
< 0.1%
2145000 1
< 0.1%
2070000 1
< 0.1%
2008934.92 1
< 0.1%
2005326.67 1
< 0.1%
1960000 1
< 0.1%
1950996.25 1
< 0.1%
1904093.05 1
< 0.1%

period
Categorical

High correlation 

Distinct30
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size946.1 KiB
2023-12
 
5691
2024-04
 
5618
2024-03
 
5433
2024-02
 
5360
2024-05
 
5155
Other values (25)
93833 

Length

Max length7
Median length7
Mean length7
Min length7

Characters and Unicode

Total characters847630
Distinct characters11
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row2023-01
2nd row2023-01
3rd row2023-01
4th row2023-01
5th row2023-01

Common Values

ValueCountFrequency (%)
2023-12 5691
 
4.7%
2024-04 5618
 
4.6%
2024-03 5433
 
4.5%
2024-02 5360
 
4.4%
2024-05 5155
 
4.3%
2024-07 5096
 
4.2%
2024-06 4891
 
4.0%
2024-10 4815
 
4.0%
2024-08 4523
 
3.7%
2024-09 4457
 
3.7%
Other values (20) 70051
57.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
2023-12 5691
 
4.7%
2024-04 5618
 
4.6%
2024-03 5433
 
4.5%
2024-02 5360
 
4.4%
2024-05 5155
 
4.3%
2024-07 5096
 
4.2%
2024-06 4891
 
4.0%
2024-10 4815
 
4.0%
2024-08 4523
 
3.7%
2024-09 4457
 
3.7%
Other values (20) 70051
57.9%

Most occurring characters

ValueCountFrequency (%)
2 263986
31.1%
0 224293
26.5%
- 121090
14.3%
4 71000
 
8.4%
3 57457
 
6.8%
1 44498
 
5.2%
5 31151
 
3.7%
7 8755
 
1.0%
8 8677
 
1.0%
6 8496
 
1.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 726540
85.7%
Dash Punctuation 121090
 
14.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 263986
36.3%
0 224293
30.9%
4 71000
 
9.8%
3 57457
 
7.9%
1 44498
 
6.1%
5 31151
 
4.3%
7 8755
 
1.2%
8 8677
 
1.2%
6 8496
 
1.2%
9 8227
 
1.1%
Dash Punctuation
ValueCountFrequency (%)
- 121090
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 847630
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 263986
31.1%
0 224293
26.5%
- 121090
14.3%
4 71000
 
8.4%
3 57457
 
6.8%
1 44498
 
5.2%
5 31151
 
3.7%
7 8755
 
1.0%
8 8677
 
1.0%
6 8496
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 847630
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 263986
31.1%
0 224293
26.5%
- 121090
14.3%
4 71000
 
8.4%
3 57457
 
6.8%
1 44498
 
5.2%
5 31151
 
3.7%
7 8755
 
1.0%
8 8677
 
1.0%
6 8496
 
1.0%

Group
Categorical

High correlation 

Distinct10
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size946.1 KiB
Metals
42904 
Banknotes
39929 
D&J
33504 
Domestic
 
1824
Catalyst
 
1261
Other values (5)
 
1668

Length

Max length11
Median length9
Mean length6.2037245
Min length3

Characters and Unicode

Total characters751209
Distinct characters27
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowOther
2nd rowMetals
3rd rowBanknotes
4th rowCatalyst
5th rowCatalyst

Common Values

ValueCountFrequency (%)
Metals 42904
35.4%
Banknotes 39929
33.0%
D&J 33504
27.7%
Domestic 1824
 
1.5%
Catalyst 1261
 
1.0%
Other 1137
 
0.9%
Dangerous 190
 
0.2%
Pharma 186
 
0.2%
Art 123
 
0.1%
Credit Card 32
 
< 0.1%
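The Frequency (%) column is simply each count divided by the number of observations. The sketch below reproduces it from the counts listed for Group. The dictionary is transcribed from the table above, and the variable names are illustrative.

```python
# Counts transcribed from the Group "Common Values" table.
counts = {
    "Metals": 42904,
    "Banknotes": 39929,
    "D&J": 33504,
    "Domestic": 1824,
    "Catalyst": 1261,
    "Other": 1137,
    "Dangerous": 190,
    "Pharma": 186,
    "Art": 123,
    "Credit Card": 32,
}

total = sum(counts.values())  # should equal the number of observations
shares = {name: 100 * n / total for name, n in counts.items()}

for name, share in shares.items():
    print(f"{name:12s} {counts[name]:6d} {share:5.1f}%")
```

The three largest groups (Metals, Banknotes, D&J) together account for roughly 96% of all rows, which is worth keeping in mind when reading the correlation alerts for this variable.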

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
metals 42904
35.4%
banknotes 39929
33.0%
d&j 33504
27.7%
domestic 1824
 
1.5%
catalyst 1261
 
1.0%
other 1137
 
0.9%
dangerous 190
 
0.2%
pharma 186
 
0.2%
art 123
 
0.1%
credit 32
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
t 88471
11.8%
s 86108
11.5%
e 86016
11.5%
a 85949
11.4%
n 80048
10.7%
l 44165
 
5.9%
M 42904
 
5.7%
o 41943
 
5.6%
B 39929
 
5.3%
k 39929
 
5.3%
Other values (17) 115747
15.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 563047
75.0%
Uppercase Letter 154626
 
20.6%
Other Punctuation 33504
 
4.5%
Space Separator 32
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 88471
15.7%
s 86108
15.3%
e 86016
15.3%
a 85949
15.3%
n 80048
14.2%
l 44165
7.8%
o 41943
7.4%
k 39929
7.1%
m 2010
 
0.4%
i 1856
 
0.3%
Other values (7) 6552
 
1.2%
Uppercase Letter
ValueCountFrequency (%)
M 42904
27.7%
B 39929
25.8%
D 35518
23.0%
J 33504
21.7%
C 1325
 
0.9%
O 1137
 
0.7%
P 186
 
0.1%
A 123
 
0.1%
Other Punctuation
ValueCountFrequency (%)
& 33504
100.0%
Space Separator
ValueCountFrequency (%)
32
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 717673
95.5%
Common 33536
 
4.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 88471
12.3%
s 86108
12.0%
e 86016
12.0%
a 85949
12.0%
n 80048
11.2%
l 44165
6.2%
M 42904
6.0%
o 41943
5.8%
B 39929
5.6%
k 39929
5.6%
Other values (15) 82211
11.5%
Common
ValueCountFrequency (%)
& 33504
99.9%
32
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 751209
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 88471
11.8%
s 86108
11.5%
e 86016
11.5%
a 85949
11.4%
n 80048
10.7%
l 44165
 
5.9%
M 42904
 
5.7%
o 41943
 
5.6%
B 39929
 
5.3%
k 39929
 
5.3%
Other values (17) 115747
15.4%

Liability m₽
Real number (ℝ)

High correlation  Skewed 

Distinct53265
Distinct (%)44.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean47.856008
Minimum1 × 10⁻⁶
Maximum118705.61
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size946.1 KiB

Quantile statistics

Minimum1 × 10⁻⁶
5-th percentile0.001
Q10.5
median2.6
Q315
95-th percentile150
Maximum118705.61
Range118705.61
Interquartile range (IQR)14.5

Descriptive statistics

Standard deviation505.59862
Coefficient of variation (CV)10.564998
Kurtosis35040.99
Mean47.856008
Median Absolute Deviation (MAD)2.599
Skewness162.35381
Sum5794884
Variance255629.96
MonotonicityNot monotonic
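To make the Skewness figure easier to interpret, the sketch below computes the moment-based skewness coefficient, γ1 = m3 / m2^(3/2), on two small samples. This is an assumption-laden illustration: the profiler may apply a bias-corrected estimator, so its value on the full data can differ slightly from this formula. The `right_tailed` sample is made up for illustration and merely echoes the Q1, median, Q3, and 95th-percentile values reported above. It is not the real data.

```python
def skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


# Illustrative samples only (not taken from the dataset).
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_tailed = [0.5, 0.5, 2.6, 15.0, 150.0]

print(skewness(symmetric))     # symmetric data gives a value of zero
print(skewness(right_tailed))  # a long right tail gives a positive value
```

A value as large as the one reported for this variable indicates that a handful of very large liabilities dominate the distribution, which matches the gap between the median (2.6) and the maximum (118705.61).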
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.5 12466
 
10.3%
0.001 12004
 
9.9%
0.05 4218
 
3.5%
10 2038
 
1.7%
1 1858
 
1.5%
1 × 10⁻⁶ 1528
 
1.3%
0.15 1221
 
1.0%
0.4 864
 
0.7%
2 845
 
0.7%
0.3 788
 
0.7%
Other values (53255) 83260
68.8%
ValueCountFrequency (%)
1 × 10⁻⁶ 1528
1.3%
2 × 10⁻⁶ 4
< 0.1%
3 × 10⁻⁶ 1
< 0.1%
4 × 10⁻⁶ 2
< 0.1%
5 × 10⁻⁶ 2
< 0.1%
7 × 10⁻⁶ 1
< 0.1%
9 × 10⁻⁶ 1
< 0.1%
1.9 × 10⁻⁵ 1
< 0.1%
3 × 10⁻⁵ 1
< 0.1%
3.1 × 10⁻⁵ 1
< 0.1%
ValueCountFrequency (%)
118705.6088 1
 
< 0.1%
94259.7687 1
 
< 0.1%
17906.23516 1
 
< 0.1%
11602.38484 3
< 0.1%
9000 1
 
< 0.1%
8758.98 1
 
< 0.1%
8756.5 1
 
< 0.1%
8043.5 1
 
< 0.1%
7500 1
 
< 0.1%
7378.5 1
 
< 0.1%

# Sh
Categorical

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size946.1 KiB
1
121090 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters121090
Distinct characters1
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row1
2nd row1
3rd row1
4th row1
5th row1

Common Values

ValueCountFrequency (%)
1 121090
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 121090
100.0%

Most occurring characters

ValueCountFrequency (%)
1 121090
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 121090
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 121090
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 121090
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 121090
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 121090
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 121090
100.0%

route
Text

Distinct1377
Distinct (%)1.1%
Missing0
Missing (%)0.0%
Memory size946.1 KiB

Length

Max length41
Median length15
Mean length17.568255
Min length11

Characters and Unicode

Total characters2127340
Distinct characters65
Distinct categories6
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique420
Unique (%)0.3%

Sample

1st rowМосква - Иркутск
2nd rowМосква - Каменск-Уральский
3rd rowМосква - Москва
4th rowНижний Новгород - Красноярск
5th rowСанкт-Петербург - Красноярск
ValueCountFrequency (%)
москва 159776
43.4%
121090
32.9%
красноярск 20104
 
5.5%
санкт-петербург 8237
 
2.2%
кострома 7108
 
1.9%
касимов 3827
 
1.0%
екатеринбург 2980
 
0.8%
новосибирск 2275
 
0.6%
красное-на-волге 1944
 
0.5%
владивосток 1419
 
0.4%
Other values (359) 39180
 
10.6%

Most occurring characters

ValueCountFrequency (%)
246850
11.6%
о 240597
11.3%
а 233532
11.0%
с 229113
10.8%
к 208564
9.8%
в 177941
8.4%
М 161929
7.6%
- 136297
 
6.4%
р 93207
 
4.4%
н 54629
 
2.6%
Other values (55) 344681
16.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1485017
69.8%
Uppercase Letter 258220
 
12.1%
Space Separator 246850
 
11.6%
Dash Punctuation 136297
 
6.4%
Other Punctuation 898
 
< 0.1%
Decimal Number 58
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
о 240597
16.2%
а 233532
15.7%
с 229113
15.4%
к 208564
14.0%
в 177941
12.0%
р 93207
 
6.3%
н 54629
 
3.7%
е 39039
 
2.6%
т 35009
 
2.4%
и 26050
 
1.8%
Other values (22) 147336
9.9%
Uppercase Letter
ValueCountFrequency (%)
М 161929
62.7%
К 37006
 
14.3%
С 12171
 
4.7%
П 11518
 
4.5%
В 6857
 
2.7%
Н 5952
 
2.3%
Е 4194
 
1.6%
Д 2392
 
0.9%
А 2039
 
0.8%
Т 1928
 
0.7%
Other values (18) 12234
 
4.7%
Other Punctuation
ValueCountFrequency (%)
. 871
97.0%
, 27
 
3.0%
Space Separator
ValueCountFrequency (%)
246850
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 136297
100.0%
Decimal Number
ValueCountFrequency (%)
2 58
100.0%

Most occurring scripts

ValueCountFrequency (%)
Cyrillic 1743237
81.9%
Common 384103
 
18.1%

Most frequent character per script

Cyrillic
ValueCountFrequency (%)
о 240597
13.8%
а 233532
13.4%
с 229113
13.1%
к 208564
12.0%
в 177941
10.2%
М 161929
9.3%
р 93207
 
5.3%
н 54629
 
3.1%
е 39039
 
2.2%
К 37006
 
2.1%
Other values (50) 267680
15.4%
Common
ValueCountFrequency (%)
246850
64.3%
- 136297
35.5%
. 871
 
0.2%
2 58
 
< 0.1%
, 27
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
Cyrillic 1743237
81.9%
ASCII 384103
 
18.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
246850
64.3%
- 136297
35.5%
. 871
 
0.2%
2 58
 
< 0.1%
, 27
 
< 0.1%
Cyrillic
ValueCountFrequency (%)
о 240597
13.8%
а 233532
13.4%
с 229113
13.1%
к 208564
12.0%
в 177941
10.2%
М 161929
9.3%
р 93207
 
5.3%
н 54629
 
3.1%
е 39039
 
2.2%
К 37006
 
2.1%
Other values (50) 267680
15.4%
Distinct80
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size946.1 KiB

Length

Max length24
Median length7
Mean length7.8143117
Min length3

Characters and Unicode

Total characters946235
Distinct characters55
Distinct categories5
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique8
Unique (%)< 0.1%

Sample

1st row_Москва
2nd row_Москва
3rd row_Москва
4th rowНижний Новгород
5th rowСанкт-Петербург
ValueCountFrequency (%)
москва 87295
71.0%
красноярск 11172
 
9.1%
санкт-петербург 3629
 
3.0%
екатеринбург 2330
 
1.9%
касимов 2255
 
1.8%
кострома 1877
 
1.5%
новосибирск 1195
 
1.0%
регион 1183
 
1.0%
бишкек 1146
 
0.9%
магадан 1018
 
0.8%
Other values (74) 9884
 
8.0%

Most occurring characters

ValueCountFrequency (%)
с 120014
12.7%
а 119749
12.7%
о 116468
12.3%
к 111168
11.7%
в 94101
9.9%
М 88782
9.4%
_ 87295
9.2%
р 42146
 
4.5%
н 24658
 
2.6%
К 16535
 
1.7%
Other values (45) 125319
13.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 724470
76.6%
Uppercase Letter 127498
 
13.5%
Connector Punctuation 87295
 
9.2%
Dash Punctuation 5078
 
0.5%
Space Separator 1894
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
с 120014
16.6%
а 119749
16.5%
о 116468
16.1%
к 111168
15.3%
в 94101
13.0%
р 42146
 
5.8%
н 24658
 
3.4%
е 14480
 
2.0%
т 14205
 
2.0%
и 12951
 
1.8%
Other values (21) 54530
7.5%
Uppercase Letter
ValueCountFrequency (%)
М 88782
69.6%
К 16535
 
13.0%
С 4999
 
3.9%
П 3972
 
3.1%
Е 2819
 
2.2%
Н 2739
 
2.1%
Р 1850
 
1.5%
В 1599
 
1.3%
Б 1207
 
0.9%
У 699
 
0.5%
Other values (11) 2297
 
1.8%
Connector Punctuation
ValueCountFrequency (%)
_ 87295
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5078
100.0%
Space Separator
ValueCountFrequency (%)
1894
100.0%

Most occurring scripts

ValueCountFrequency (%)
Cyrillic 851968
90.0%
Common 94267
 
10.0%

Most frequent character per script

Cyrillic
ValueCountFrequency (%)
с 120014
14.1%
а 119749
14.1%
о 116468
13.7%
к 111168
13.0%
в 94101
11.0%
М 88782
10.4%
р 42146
 
4.9%
н 24658
 
2.9%
К 16535
 
1.9%
е 14480
 
1.7%
Other values (42) 103867
12.2%
Common
ValueCountFrequency (%)
_ 87295
92.6%
- 5078
 
5.4%
1894
 
2.0%

Most occurring blocks

ValueCountFrequency (%)
Cyrillic 851968
90.0%
ASCII 94267
 
10.0%

Most frequent character per block

Cyrillic
ValueCountFrequency (%)
с 120014
14.1%
а 119749
14.1%
о 116468
13.7%
к 111168
13.0%
в 94101
11.0%
М 88782
10.4%
р 42146
 
4.9%
н 24658
 
2.9%
К 16535
 
1.9%
е 14480
 
1.7%
Other values (42) 103867
12.2%
ASCII
ValueCountFrequency (%)
_ 87295
92.6%
- 5078
 
5.4%
1894
 
2.0%
Distinct81
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size946.1 KiB

Length

Max length24
Median length7
Mean length8.0098357
Min length3

Characters and Unicode

Total characters969911
Distinct characters55
Distinct categories5
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3
Unique (%)< 0.1%

Sample

1st rowИркутск
2nd rowЕкатеринбург
3rd row_Москва
4th rowКрасноярск
5th rowКрасноярск
ValueCountFrequency (%)
москва 81868
65.9%
красноярск 8946
 
7.2%
кострома 7742
 
6.2%
санкт-петербург 5126
 
4.1%
екатеринбург 2332
 
1.9%
регион 2029
 
1.6%
касимов 1572
 
1.3%
новосибирск 1082
 
0.9%
ростов-на-дону 817
 
0.7%
нижний 798
 
0.6%
Other values (75) 11838
 
9.5%

Most occurring characters

ValueCountFrequency (%)
о 123868
12.8%
а 119170
12.3%
с 115874
11.9%
к 103434
10.7%
в 89242
9.2%
М 82253
8.5%
_ 81868
8.4%
р 49298
 
5.1%
н 25710
 
2.7%
т 23724
 
2.4%
Other values (45) 155470
16.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 747607
77.1%
Uppercase Letter 130374
 
13.4%
Connector Punctuation 81868
 
8.4%
Dash Punctuation 7002
 
0.7%
Space Separator 3060
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
о 123868
16.6%
а 119170
15.9%
с 115874
15.5%
к 103434
13.8%
в 89242
11.9%
р 49298
 
6.6%
н 25710
 
3.4%
т 23724
 
3.2%
е 18605
 
2.5%
и 12219
 
1.6%
Other values (21) 66463
8.9%
Uppercase Letter
ValueCountFrequency (%)
М 82253
63.1%
К 20125
 
15.4%
С 7536
 
5.8%
П 5818
 
4.5%
Р 3004
 
2.3%
Е 2924
 
2.2%
Н 2914
 
2.2%
В 1653
 
1.3%
Д 817
 
0.6%
Ч 793
 
0.6%
Other values (11) 2537
 
1.9%
Connector Punctuation
ValueCountFrequency (%)
_ 81868
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 7002
100.0%
Space Separator
ValueCountFrequency (%)
3060
100.0%

Most occurring scripts

ValueCountFrequency (%)
Cyrillic 877981
90.5%
Common 91930
 
9.5%

Most frequent character per script

Cyrillic
ValueCountFrequency (%)
о 123868
14.1%
а 119170
13.6%
с 115874
13.2%
к 103434
11.8%
в 89242
10.2%
М 82253
9.4%
р 49298
 
5.6%
н 25710
 
2.9%
т 23724
 
2.7%
К 20125
 
2.3%
Other values (42) 125283
14.3%
Common
ValueCountFrequency (%)
_ 81868
89.1%
- 7002
 
7.6%
3060
 
3.3%

Most occurring blocks

ValueCountFrequency (%)
Cyrillic 877981
90.5%
ASCII 91930
 
9.5%

Most frequent character per block

Cyrillic
ValueCountFrequency (%)
о 123868
14.1%
а 119170
13.6%
с 115874
13.2%
к 103434
11.8%
в 89242
10.2%
М 82253
9.4%
р 49298
 
5.6%
н 25710
 
2.9%
т 23724
 
2.7%
К 20125
 
2.3%
Other values (42) 125283
14.3%
ASCII
ValueCountFrequency (%)
_ 81868
89.1%
- 7002
 
7.6%
3060
 
3.3%
Distinct607
Distinct (%)0.5%
Missing0
Missing (%)0.0%
Memory size946.1 KiB

Length

Max length39
Median length17
Mean length18.824147
Min length11

Characters and Unicode

Total characters2279416
Distinct characters55
Distinct categories5
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique143
Unique (%)0.1%

Sample

1st row_Москва - Иркутск
2nd row_Москва - Екатеринбург
3rd row_Москва - _Москва
4th rowНижний Новгород - Красноярск
5th rowСанкт-Петербург - Красноярск
ValueCountFrequency (%)
москва 169163
45.9%
121090
32.9%
красноярск 20118
 
5.5%
кострома 9619
 
2.6%
санкт-петербург 8755
 
2.4%
екатеринбург 4662
 
1.3%
касимов 3827
 
1.0%
регион 3212
 
0.9%
новосибирск 2277
 
0.6%
владивосток 1535
 
0.4%
Other values (77) 23966
 
6.5%

Most occurring characters

ValueCountFrequency (%)
247134
10.8%
о 240336
10.5%
а 238919
10.5%
с 235888
10.3%
к 214602
9.4%
в 183343
8.0%
М 171035
7.5%
_ 169163
7.4%
- 133170
 
5.8%
р 91444
 
4.0%
Other values (45) 354382
15.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1472077
64.6%
Uppercase Letter 257872
 
11.3%
Space Separator 247134
 
10.8%
Connector Punctuation 169163
 
7.4%
Dash Punctuation 133170
 
5.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
о 240336
16.3%
а 238919
16.2%
с 235888
16.0%
к 214602
14.6%
в 183343
12.5%
р 91444
 
6.2%
н 50368
 
3.4%
т 37929
 
2.6%
е 33085
 
2.2%
и 25170
 
1.7%
Other values (21) 120993
8.2%
Uppercase Letter
ValueCountFrequency (%)
М 171035
66.3%
К 36660
 
14.2%
С 12535
 
4.9%
П 9790
 
3.8%
Е 5743
 
2.2%
Н 5653
 
2.2%
Р 4854
 
1.9%
В 3252
 
1.3%
Д 1398
 
0.5%
Ч 1386
 
0.5%
Other values (11) 5566
 
2.2%
Space Separator
ValueCountFrequency (%)
247134
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 169163
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 133170
100.0%

Most occurring scripts

ValueCountFrequency (%)
Cyrillic 1729949
75.9%
Common 549467
 
24.1%

Most frequent character per script

Cyrillic
ValueCountFrequency (%)
о 240336
13.9%
а 238919
13.8%
с 235888
13.6%
к 214602
12.4%
в 183343
10.6%
М 171035
9.9%
р 91444
 
5.3%
н 50368
 
2.9%
т 37929
 
2.2%
К 36660
 
2.1%
Other values (42) 229425
13.3%
Common
ValueCountFrequency (%)
247134
45.0%
_ 169163
30.8%
- 133170
24.2%

Most occurring blocks

ValueCountFrequency (%)
Cyrillic 1729949
75.9%
ASCII 549467
 
24.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
247134
45.0%
_ 169163
30.8%
- 133170
24.2%
Cyrillic
ValueCountFrequency (%)
о 240336
13.9%
а 238919
13.8%
с 235888
13.6%
к 214602
12.4%
в 183343
10.6%
М 171035
9.9%
р 91444
 
5.3%
н 50368
 
2.9%
т 37929
 
2.2%
К 36660
 
2.1%
Other values (42) 229425
13.3%
Distinct406
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size946.1 KiB

Length

Max length42
Median length20
Mean length21.824147
Min length14

Characters and Unicode

Total characters2642686
Distinct characters57
Distinct categories6
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique82
Unique (%)0.1%

Sample

1st row_Москва <--> Иркутск
2nd row_Москва <--> Екатеринбург
3rd row_Москва <--> _Москва
4th rowНижний Новгород <--> Красноярск
5th rowКрасноярск <--> Санкт-Петербург
ValueCountFrequency (%)
москва 169163
45.9%
121090
32.9%
красноярск 20118
 
5.5%
кострома 9619
 
2.6%
санкт-петербург 8755
 
2.4%
екатеринбург 4662
 
1.3%
касимов 3827
 
1.0%
регион 3212
 
0.9%
новосибирск 2277
 
0.6%
владивосток 1535
 
0.4%
Other values (77) 23966
 
6.5%

Most occurring characters

ValueCountFrequency (%)
- 254260
9.6%
247134
9.4%
о 240336
9.1%
а 238919
9.0%
с 235888
8.9%
к 214602
 
8.1%
в 183343
 
6.9%
М 171035
 
6.5%
_ 169163
 
6.4%
< 121090
 
4.6%
Other values (47) 566916
21.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1472077
55.7%
Uppercase Letter 257872
 
9.8%
Dash Punctuation 254260
 
9.6%
Space Separator 247134
 
9.4%
Math Symbol 242180
 
9.2%
Connector Punctuation 169163
 
6.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
о 240336
16.3%
а 238919
16.2%
с 235888
16.0%
к 214602
14.6%
в 183343
12.5%
р 91444
 
6.2%
н 50368
 
3.4%
т 37929
 
2.6%
е 33085
 
2.2%
и 25170
 
1.7%
Other values (21) 120993
8.2%
Uppercase Letter
ValueCountFrequency (%)
М 171035
66.3%
К 36660
 
14.2%
С 12535
 
4.9%
П 9790
 
3.8%
Е 5743
 
2.2%
Н 5653
 
2.2%
Р 4854
 
1.9%
В 3252
 
1.3%
Д 1398
 
0.5%
Ч 1386
 
0.5%
Other values (11) 5566
 
2.2%
Math Symbol
ValueCountFrequency (%)
< 121090
50.0%
> 121090
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 254260
100.0%
Space Separator
ValueCountFrequency (%)
247134
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 169163
100.0%

Most occurring scripts

ValueCountFrequency (%)
Cyrillic 1729949
65.5%
Common 912737
34.5%

Most frequent character per script

Cyrillic
ValueCountFrequency (%)
о 240336
13.9%
а 238919
13.8%
с 235888
13.6%
к 214602
12.4%
в 183343
10.6%
М 171035
9.9%
р 91444
 
5.3%
н 50368
 
2.9%
т 37929
 
2.2%
К 36660
 
2.1%
Other values (42) 229425
13.3%
Common
ValueCountFrequency (%)
- 254260
27.9%
247134
27.1%
_ 169163
18.5%
< 121090
13.3%
> 121090
13.3%

Most occurring blocks

ValueCountFrequency (%)
Cyrillic 1729949
65.5%
ASCII 912737
34.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 254260
27.9%
247134
27.1%
_ 169163
18.5%
< 121090
13.3%
> 121090
13.3%
Cyrillic
ValueCountFrequency (%)
о 240336
13.9%
а 238919
13.8%
с 235888
13.6%
к 214602
12.4%
в 183343
10.6%
М 171035
9.9%
р 91444
 
5.3%
н 50368
 
2.9%
т 37929
 
2.2%
К 36660
 
2.1%
Other values (42) 229425
13.3%

RU&VAT
Categorical

High correlation  Imbalance 

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size946.1 KiB
RU-
117958 
non-RU-VAT0
 
3014
RU-VAT0
 
90
non-RU-
 
28

Length

Max length11
Median length3
Mean length3.2030225
Min length3

Characters and Unicode

Total characters387854
Distinct characters9
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowRU-
2nd rowRU-
3rd rowRU-
4th rowRU-
5th rowRU-

Common Values

ValueCountFrequency (%)
RU- 117958
97.4%
non-RU-VAT0 3014
 
2.5%
RU-VAT0 90
 
0.1%
non-RU- 28
 
< 0.1%
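The Imbalance alert for this variable quotes a score of 91.0%. One way to arrive at that figure, assuming the score is defined as one minus the Shannon entropy of the class distribution normalised by its maximum possible value, is sketched below. The function name `imbalance_score` is hypothetical and the definition is an assumption rather than a statement about the profiler's internals. It does, however, reproduce the reported value from the four counts in the table above.

```python
import math


def imbalance_score(class_counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class distribution and k is the number of classes.

    0 means perfectly balanced; values near 1 mean one class dominates.
    """
    k = len(class_counts)
    if k < 2:
        return 0.0
    total = sum(class_counts)
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in class_counts if c > 0
    )
    return 1 - entropy / math.log2(k)


# Counts transcribed from the RU&VAT "Common Values" table.
counts = {"RU-": 117958, "non-RU-VAT0": 3014, "RU-VAT0": 90, "non-RU-": 28}
score = imbalance_score(list(counts.values()))
print(f"{score:.1%}")
```

With 97.4% of rows in a single class, the entropy is only about 9% of its maximum, hence a score of roughly 91%.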

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
ru 117958
97.4%
non-ru-vat0 3014
 
2.5%
ru-vat0 90
 
0.1%
non-ru 28
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
- 124132
32.0%
R 121090
31.2%
U 121090
31.2%
n 6084
 
1.6%
V 3104
 
0.8%
A 3104
 
0.8%
T 3104
 
0.8%
0 3104
 
0.8%
o 3042
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 251492
64.8%
Dash Punctuation 124132
32.0%
Lowercase Letter 9126
 
2.4%
Decimal Number 3104
 
0.8%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R 121090
48.1%
U 121090
48.1%
V 3104
 
1.2%
A 3104
 
1.2%
T 3104
 
1.2%
Lowercase Letter
ValueCountFrequency (%)
n 6084
66.7%
o 3042
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 124132
100.0%
Decimal Number
ValueCountFrequency (%)
0 3104
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 260618
67.2%
Common 127236
32.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 121090
46.5%
U 121090
46.5%
n 6084
 
2.3%
V 3104
 
1.2%
A 3104
 
1.2%
T 3104
 
1.2%
o 3042
 
1.2%
Common
ValueCountFrequency (%)
- 124132
97.6%
0 3104
 
2.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 387854
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 124132
32.0%
R 121090
31.2%
U 121090
31.2%
n 6084
 
1.6%
V 3104
 
0.8%
A 3104
 
0.8%
T 3104
 
0.8%
0 3104
 
0.8%
o 3042
 
0.8%

Customer
Text

Distinct 607
Distinct (%) 0.5%
Missing 0
Missing (%) 0.0%
Memory size 946.1 KiB

Length

Max length 64
Median length 49
Mean length 12.287018
Min length 3

Characters and Unicode

Total characters 1487835
Distinct characters 87
Distinct categories 10
Distinct scripts 2
Distinct blocks 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 118
Unique (%) 0.1%

Sample

1st row ИрАэро
2nd row Техномет / ТМТорг
3rd row РОСБАНК
4th row Автокаты
5th row Автокаты

Value Count Frequency (%)
росбанк 23845 10.0%
ломбард 18511 7.8%
ооо 13203 5.6%
(blank) 12756 5.4%
мюз 10790 4.5%
калтаев 9931 4.2%
все 9931 4.2%
оао 8184 3.4%
красцветмет 8159 3.4%
авто 7418 3.1%
Other values (859) 114687 48.3%

Most occurring characters

Value Count Frequency (%)
(space) 116390 7.8%
а 104109 7.0%
О 99497 6.7%
о 81306 5.5%
е 75977 5.1%
т 69490 4.7%
А 63239 4.3%
в 60652 4.1%
К 58499 3.9%
р 56272 3.8%
Other values (77) 702404 47.2%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 808505 54.3%
Uppercase Letter 539840 36.3%
Space Separator 116390 7.8%
Dash Punctuation 14909 1.0%
Other Punctuation 6616 0.4%
Open Punctuation 537 < 0.1%
Close Punctuation 537 < 0.1%
Decimal Number 495 < 0.1%
Initial Punctuation 3 < 0.1%
Final Punctuation 3 < 0.1%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
а 104109 12.9%
о 81306 10.1%
е 75977 9.4%
т 69490 8.6%
в 60652 7.5%
р 56272 7.0%
л 45223 5.6%
м 45150 5.6%
н 39462 4.9%
с 36598 4.5%
Other values (23) 194266 24.0%

Uppercase Letter
Value Count Frequency (%)
О 99497 18.4%
А 63239 11.7%
К 58499 10.8%
С 44687 8.3%
Б 40291 7.5%
Р 37985 7.0%
Л 31971 5.9%
Н 28860 5.3%
З 24153 4.5%
Т 19072 3.5%
Other values (23) 91586 17.0%

Decimal Number
Value Count Frequency (%)
1 127 25.7%
7 114 23.0%
9 90 18.2%
2 70 14.1%
4 40 8.1%
6 31 6.3%
5 8 1.6%
0 8 1.6%
8 6 1.2%
3 1 0.2%

Other Punctuation
Value Count Frequency (%)
/ 5439 82.2%
. 1136 17.2%
" 32 0.5%
, 9 0.1%

Dash Punctuation
Value Count Frequency (%)
- 14834 99.5%
(dash character not preserved) 75 0.5%

Space Separator
Value Count Frequency (%)
(space) 116390 100.0%

Open Punctuation
Value Count Frequency (%)
( 537 100.0%

Close Punctuation
Value Count Frequency (%)
) 537 100.0%

Initial Punctuation
Value Count Frequency (%)
« 3 100.0%

Final Punctuation
Value Count Frequency (%)
» 3 100.0%

Most occurring scripts

Value Count Frequency (%)
Cyrillic 1348345 90.6%
Common 139490 9.4%

Most frequent character per script

Cyrillic
Value Count Frequency (%)
а 104109 7.7%
О 99497 7.4%
о 81306 6.0%
е 75977 5.6%
т 69490 5.2%
А 63239 4.7%
в 60652 4.5%
К 58499 4.3%
р 56272 4.2%
л 45223 3.4%
Other values (56) 634081 47.0%

Common
Value Count Frequency (%)
(space) 116390 83.4%
- 14834 10.6%
/ 5439 3.9%
. 1136 0.8%
( 537 0.4%
) 537 0.4%
1 127 0.1%
7 114 0.1%
9 90 0.1%
(dash character not preserved) 75 0.1%
Other values (11) 211 0.2%

Most occurring blocks

Value Count Frequency (%)
Cyrillic 1348345 90.6%
ASCII 139409 9.4%
Punctuation 75 < 0.1%
None 6 < 0.1%

Most frequent character per block

ASCII
Value Count Frequency (%)
(space) 116390 83.5%
- 14834 10.6%
/ 5439 3.9%
. 1136 0.8%
( 537 0.4%
) 537 0.4%
1 127 0.1%
7 114 0.1%
9 90 0.1%
2 70 0.1%
Other values (8) 135 0.1%

Cyrillic
Value Count Frequency (%)
а 104109 7.7%
О 99497 7.4%
о 81306 6.0%
е 75977 5.6%
т 69490 5.2%
А 63239 4.7%
в 60652 4.5%
К 58499 4.3%
р 56272 4.2%
л 45223 3.4%
Other values (56) 634081 47.0%

Punctuation
Value Count Frequency (%)
(dash character not preserved) 75 100.0%

None
Value Count Frequency (%)
« 3 50.0%
» 3 50.0%

Group2
Categorical

High correlation 

Distinct 6
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 946.1 KiB

2-PM 44412
4-ATM/CIT 40008
3-DJ 33452
1-BN 1745
6-O 1441

Length

Max length 9
Median length 4
Mean length 5.6400941
Min length 3

Characters and Unicode

Total characters 682959
Distinct characters 19
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 6-O
2nd row 2-PM
3rd row 4-ATM/CIT
4th row 2-PM
5th row 2-PM

Common Values

Value Count Frequency (%)
2-PM 44412 36.7%
4-ATM/CIT 40008 33.0%
3-DJ 33452 27.6%
1-BN 1745 1.4%
6-O 1441 1.2%
5-CC 32 < 0.1%
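
The length and character totals reported for Group2 can be cross-checked against this table. The sketch below is only an illustration in plain Python: it weights each value's length by its count, and all variable names are hypothetical.

```python
# Value counts copied from the Group2 Common Values table.
value_counts = {
    "2-PM": 44412,
    "4-ATM/CIT": 40008,
    "3-DJ": 33452,
    "1-BN": 1745,
    "6-O": 1441,
    "5-CC": 32,
}

total_rows = sum(value_counts.values())

# Each value contributes len(value) characters once per occurrence.
total_chars = sum(len(value) * rows for value, rows in value_counts.items())
mean_length = total_chars / total_rows

min_length = min(len(value) for value in value_counts)
max_length = max(len(value) for value in value_counts)

print(total_rows, total_chars, round(mean_length, 7), min_length, max_length)
```

The results match the figures reported for Group2: 121090 rows, 682959 total characters, a mean length of about 5.64, and minimum and maximum lengths of 3 and 9.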

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
2-pm 44412 36.7%
4-atm/cit 40008 33.0%
3-dj 33452 27.6%
1-bn 1745 1.4%
6-o 1441 1.2%
5-cc 32 < 0.1%

Most occurring characters

Value Count Frequency (%)
- 121090 17.7%
M 84420 12.4%
T 80016 11.7%
2 44412 6.5%
P 44412 6.5%
C 40072 5.9%
/ 40008 5.9%
I 40008 5.9%
A 40008 5.9%
4 40008 5.9%
Other values (9) 108505 15.9%

Most occurring categories

Value Count Frequency (%)
Uppercase Letter 400771 58.7%
Dash Punctuation 121090 17.7%
Decimal Number 121090 17.7%
Other Punctuation 40008 5.9%

Most frequent character per category

Uppercase Letter
Value Count Frequency (%)
M 84420 21.1%
T 80016 20.0%
P 44412 11.1%
C 40072 10.0%
I 40008 10.0%
A 40008 10.0%
D 33452 8.3%
J 33452 8.3%
B 1745 0.4%
N 1745 0.4%

Decimal Number
Value Count Frequency (%)
2 44412 36.7%
4 40008 33.0%
3 33452 27.6%
1 1745 1.4%
6 1441 1.2%
5 32 < 0.1%

Dash Punctuation
Value Count Frequency (%)
- 121090 100.0%

Other Punctuation
Value Count Frequency (%)
/ 40008 100.0%

Most occurring scripts

Value Count Frequency (%)
Latin 400771 58.7%
Common 282188 41.3%

Most frequent character per script

Latin
Value Count Frequency (%)
M 84420 21.1%
T 80016 20.0%
P 44412 11.1%
C 40072 10.0%
I 40008 10.0%
A 40008 10.0%
D 33452 8.3%
J 33452 8.3%
B 1745 0.4%
N 1745 0.4%

Common
Value Count Frequency (%)
- 121090 42.9%
2 44412 15.7%
/ 40008 14.2%
4 40008 14.2%
3 33452 11.9%
1 1745 0.6%
6 1441 0.5%
5 32 < 0.1%

Most occurring blocks

Value Count Frequency (%)
ASCII 682959 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
- 121090 17.7%
M 84420 12.4%
T 80016 11.7%
2 44412 6.5%
P 44412 6.5%
C 40072 5.9%
/ 40008 5.9%
I 40008 5.9%
A 40008 5.9%
4 40008 5.9%
Other values (9) 108505 15.9%

Group det
Categorical

High correlation 

Distinct 18
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 946.1 KiB

4-6 Other 40008
3-3 In city 32938
2-4 Other air transport 23363
2-5 Other onground trans 9001
2-7 Vault ops 4635
Other values (13) 11145

Length

Max length 29
Median length 24
Mean length 14.392939
Min length 7

Characters and Unicode

Total characters 1742841
Distinct characters 44
Distinct categories 7
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) < 0.1%

Sample

1st row 6-6 Spare parts
2nd row 2-4 Other air transport
3rd row 4-6 Other
4th row 2-6 Catalyst
5th row 2-6 Catalyst

Common Values

Value Count Frequency (%)
4-6 Other 40008 33.0%
3-3 In city 32938 27.2%
2-4 Other air transport 23363 19.3%
2-5 Other onground trans 9001 7.4%
2-7 Vault ops 4635 3.8%
2-2 PM Banks domestic 4002 3.3%
1-1 International 1745 1.4%
2-3 From mines to aff 1308 1.1%
2-6 Catalyst 1258 1.0%
2-1 Iter-n (without catalyst) 844 0.7%
Other values (8) 1988 1.6%

Length

Histogram of lengths of the category

Value Count Frequency (%)
other 72404 20.2%
4-6 40008 11.1%
3-3 32938 9.2%
in 32938 9.2%
city 32938 9.2%
2-4 23363 6.5%
air 23363 6.5%
transport 23363 6.5%
2-5 9001 2.5%
onground 9001 2.5%
Other values (37) 59914 16.7%

Most occurring characters

Value Count Frequency (%)
(space) 238141 13.7%
t 182601 10.5%
r 166401 9.5%
- 121935 7.0%
n 96647 5.5%
e 81795 4.7%
a 75952 4.4%
h 73434 4.2%
O 72404 4.2%
3 68467 3.9%
Other values (34) 565064 32.4%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 1009802 57.9%
Decimal Number 242180 13.9%
Space Separator 238141 13.7%
Uppercase Letter 129095 7.4%
Dash Punctuation 121935 7.0%
Open Punctuation 844 < 0.1%
Close Punctuation 844 < 0.1%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
t 182601 18.1%
r 166401 16.5%
n 96647 9.6%
e 81795 8.1%
a 75952 7.5%
h 73434 7.3%
i 64714 6.4%
o 56801 5.6%
s 48777 4.8%
c 38553 3.8%
Other values (10) 124127 12.3%

Uppercase Letter
Value Count Frequency (%)
O 72404 56.1%
I 36042 27.9%
V 4636 3.6%
P 4188 3.2%
M 4002 3.1%
B 4002 3.1%
F 1308 1.0%
C 1258 1.0%
G 769 0.6%
S 208 0.2%
Other values (2) 278 0.2%

Decimal Number
Value Count Frequency (%)
3 68467 28.3%
4 63371 26.2%
2 48632 20.1%
6 42915 17.7%
5 9188 3.8%
1 4971 2.1%
7 4635 1.9%
8 1 < 0.1%

Space Separator
Value Count Frequency (%)
(space) 238141 100.0%

Dash Punctuation
Value Count Frequency (%)
- 121935 100.0%

Open Punctuation
Value Count Frequency (%)
( 844 100.0%

Close Punctuation
Value Count Frequency (%)
) 844 100.0%

Most occurring scripts

Value Count Frequency (%)
Latin 1138897 65.3%
Common 603944 34.7%

Most frequent character per script

Latin
Value Count Frequency (%)
t 182601 16.0%
r 166401 14.6%
n 96647 8.5%
e 81795 7.2%
a 75952 6.7%
h 73434 6.4%
O 72404 6.4%
i 64714 5.7%
o 56801 5.0%
s 48777 4.3%
Other values (22) 219371 19.3%

Common
Value Count Frequency (%)
(space) 238141 39.4%
- 121935 20.2%
3 68467 11.3%
4 63371 10.5%
2 48632 8.1%
6 42915 7.1%
5 9188 1.5%
1 4971 0.8%
7 4635 0.8%
( 844 0.1%
Other values (2) 845 0.1%

Most occurring blocks

Value Count Frequency (%)
ASCII 1742841 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
(space) 238141 13.7%
t 182601 10.5%
r 166401 9.5%
- 121935 7.0%
n 96647 5.5%
e 81795 4.7%
a 75952 4.4%
h 73434 4.2%
O 72404 4.2%
3 68467 3.9%
Other values (34) 565064 32.4%

Interactions

[Figures: 16 interaction plots; images not preserved]

Correlations

[Figure: correlation heatmap; image not preserved]

Variable | DepartmentCodeDescription | Gross Weight | Group | Group det | Group2 | Last Update User | Liability m₽ | Mode Of Transport Sh | RU&VAT | Revenue (Local) | Service Type Name | Shipment # | period | Автор | Статья ДДС
DepartmentCodeDescription | 1.000 | 0.069 | 0.999 | 0.719 | 0.893 | 0.383 | 0.000 | 0.473 | 0.106 | 0.047 | 0.316 | 0.098 | 0.073 | 0.369 | 0.510
Gross Weight | 0.069 | 1.000 | 0.069 | 0.072 | 0.034 | 0.035 | 0.624 | 0.200 | 0.000 | 0.605 | 0.019 | 0.033 | 0.008 | 0.059 | 0.042
Group | 0.999 | 0.069 | 1.000 | 0.882 | 0.894 | 0.442 | 0.000 | 0.465 | 0.049 | 0.047 | 0.365 | 0.097 | 0.089 | 0.438 | 0.543
Group det | 0.719 | 0.072 | 0.882 | 1.000 | 1.000 | 0.417 | 0.018 | 0.546 | 0.580 | 0.163 | 0.390 | 0.099 | 0.067 | 0.391 | 0.588
Group2 | 0.893 | 0.034 | 0.894 | 1.000 | 1.000 | 0.575 | 0.018 | 0.251 | 0.432 | 0.202 | 0.498 | 0.091 | 0.104 | 0.577 | 0.639
Last Update User | 0.383 | 0.035 | 0.442 | 0.417 | 0.575 | 1.000 | 0.000 | 0.353 | 0.252 | 0.047 | 0.316 | 0.228 | 0.121 | 0.610 | 0.526
Liability m₽ | 0.000 | 0.624 | 0.000 | 0.018 | 0.018 | 0.000 | 1.000 | 0.000 | 0.014 | 0.622 | 0.000 | 0.073 | 0.000 | 0.008 | 0.000
Mode Of Transport Sh | 0.473 | 0.200 | 0.465 | 0.546 | 0.251 | 0.353 | 0.000 | 1.000 | 0.089 | 0.061 | 0.449 | 0.072 | 0.084 | 0.306 | 0.525
RU&VAT | 0.106 | 0.000 | 0.049 | 0.580 | 0.432 | 0.252 | 0.014 | 0.089 | 1.000 | 0.270 | 0.197 | 0.028 | 0.030 | 0.222 | 0.115
Revenue (Local) | 0.047 | 0.605 | 0.047 | 0.163 | 0.202 | 0.047 | 0.622 | 0.061 | 0.270 | 1.000 | 0.057 | 0.113 | 0.015 | 0.084 | 0.043
Service Type Name | 0.316 | 0.019 | 0.365 | 0.390 | 0.498 | 0.316 | 0.000 | 0.449 | 0.197 | 0.057 | 1.000 | 0.140 | 0.078 | 0.302 | 0.480
Shipment # | 0.098 | 0.033 | 0.097 | 0.099 | 0.091 | 0.228 | 0.073 | 0.072 | 0.028 | 0.113 | 0.140 | 1.000 | 0.874 | 0.247 | 0.182
period | 0.073 | 0.008 | 0.089 | 0.067 | 0.104 | 0.121 | 0.000 | 0.084 | 0.030 | 0.015 | 0.078 | 0.874 | 1.000 | 0.186 | 0.147
Автор | 0.369 | 0.059 | 0.438 | 0.391 | 0.577 | 0.610 | 0.008 | 0.306 | 0.222 | 0.084 | 0.302 | 0.247 | 0.186 | 1.000 | 0.489
Статья ДДС | 0.510 | 0.042 | 0.543 | 0.588 | 0.639 | 0.526 | 0.000 | 0.525 | 0.115 | 0.043 | 0.480 | 0.182 | 0.147 | 0.489 | 1.000
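
The report does not say which association measure fills each cell. For pairs of categorical variables a common choice is Cramér's V, so the sketch below shows the uncorrected form of that statistic in plain Python purely as an illustration. The function name `cramers_v` and the toy inputs are hypothetical, and the profiler may use a different or bias-corrected measure.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, row_n in row_totals.items():
        for y, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association
    return math.sqrt(chi2 / (n * k))

# Toy data: the detailed label fully determines the coarse group.
group2 = ["2-PM", "2-PM", "3-DJ", "3-DJ"]
group_det = ["2-6 Catalyst", "2-6 Catalyst", "3-3 In city", "3-3 In city"]
print(cramers_v(group2, group_det))
```

A value of 1 means one variable is fully determined by the other. That would be consistent with the 1.000 between Group2 and Group det, whose labels in the sample rows share the same leading digit.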

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
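
One way to read nullity correlation is as the Pearson correlation between per-column missingness indicators. The sketch below illustrates that reading in plain Python. The function name `nullity_correlation` and the toy columns are hypothetical, and the figure in the report may have been computed differently.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return math.nan  # a column that is never (or always) missing
    return cov / math.sqrt(var_a * var_b)

# Toy columns: the first two are missing together, the third is missing
# exactly where the first is present.
weight = [None, 1.0, None, 2.0]
revenue = [None, 3.0, None, 4.0]
note = ["x", None, "y", None]

print(nullity_correlation(weight, revenue))  # missing together
print(nullity_correlation(weight, note))     # missing in opposite rows
```

Values near 1 indicate columns that tend to be missing in the same rows, and values near -1 indicate columns where one is present whenever the other is missing.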

Sample

First rows

(index) | Shipment # | Pickup Date w/o time | Delivery Date w/o time | Last Update Date | Last Update User | Дата создания | Автор | Статья ДДС | Shipper Name | City PU | City DLV | Mode Of Transport Sh | DepartmentCodeDescription | Service Type Name | Gross Weight | Revenue (Local) | period | Group | Liability m₽ | # Sh | route | PU City | DLV City | direction | direction both | RU&VAT | Customer | Group2 | Group det
0 | 708064 | 31/12/22 | 01/01/23 | 18/01/23 | Скопенко Анна Алексеевна | 30/12/22 | Скопенко Анна Алексеевна | ПРОЧЕЕ | ИрАэро Авиакомпания АО | Москва | Иркутск | Air | Other | Дверь-в-Дверь | 150.0 | 40750.00 | 2023-01 | Other | 0.000061 | 1 | Москва - Иркутск | _Москва | Иркутск | _Москва - Иркутск | _Москва <--> Иркутск | RU- | ИрАэро | 6-O | 6-6 Spare parts
1 | 707864 | 29/12/22 | 03/01/23 | 30/01/23 | Новикова Дарья | 28/12/22 | Новикова Дарья | ТРЕЙДИНГ ДМ | Техномет | Москва | Каменск-Уральский | Air | Prec.Metals | Дверь в дверь ЭКСПРЕСС | 25.0 | 47481.90 | 2023-01 | Metals | 1.473012 | 1 | Москва - Каменск-Уральский | _Москва | Екатеринбург | _Москва - Екатеринбург | _Москва <--> Екатеринбург | RU- | Техномет / ТМТорг | 2-PM | 2-4 Other air transport
2 | 708022 | 30/12/22 | 03/01/23 | 31/01/23 | Косарынская Людмила | 30/12/22 | Косарынская Людмила | БАНКИ | РОСБАНК ПАО | Москва | Москва | Road | Banknotes | Банкоматы | 1.0 | 500.00 | 2023-01 | Banknotes | 8.000000 | 1 | Москва - Москва | _Москва | _Москва | _Москва - _Москва | _Москва <--> _Москва | RU- | РОСБАНК | 4-ATM/CIT | 4-6 Other
3 | 707532 | 28/12/22 | 04/01/23 | 17/01/23 | Кудрявцева Надежда | 26/12/22 | Косарынская Людмила | АВТОКАТАЛИЗАТОРЫ | Еврокат Волга ООО | Нижний Новгород | Красноярск | Rail | Catalyst | Дверь-в-Дверь | 968.0 | 78114.21 | 2023-01 | Catalyst | 6.811035 | 1 | Нижний Новгород - Красноярск | Нижний Новгород | Красноярск | Нижний Новгород - Красноярск | Нижний Новгород <--> Красноярск | RU- | Автокаты | 2-PM | 2-6 Catalyst
4 | 707524 | 28/12/22 | 04/01/23 | 17/01/23 | Кудрявцева Надежда | 26/12/22 | Косарынская Людмила | АВТОКАТАЛИЗАТОРЫ | Евромет Север ООО | Санкт-Петербург | Красноярск | Rail | Catalyst | Дверь-в-Дверь | 2277.0 | 132226.85 | 2023-01 | Catalyst | 22.119261 | 1 | Санкт-Петербург - Красноярск | Санкт-Петербург | Красноярск | Санкт-Петербург - Красноярск | Красноярск <--> Санкт-Петербург | RU- | Автокаты | 2-PM | 2-6 Catalyst
5 | 707696 | 29/12/22 | 04/01/23 | 17/01/23 | Кудрявцева Надежда | 27/12/22 | Косарынская Людмила | АВТОКАТАЛИЗАТОРЫ | Восток Запад ООО | Москва | Красноярск | Rail | Catalyst | Дверь-в-Дверь | 2203.0 | 98447.70 | 2023-01 | Catalyst | 20.563471 | 1 | Москва - Красноярск | _Москва | Красноярск | _Москва - Красноярск | _Москва <--> Красноярск | RU- | Автокаты | 2-PM | 2-6 Catalyst
6 | 707697 | 29/12/22 | 04/01/23 | 17/01/23 | Кудрявцева Надежда | 27/12/22 | Косарынская Людмила | АВТОКАТАЛИЗАТОРЫ | Евромет Запад ООО | Москва | Красноярск | Rail | Catalyst | Дверь-в-Дверь | 1305.0 | 58132.17 | 2023-01 | Catalyst | 12.960454 | 1 | Москва - Красноярск | _Москва | Красноярск | _Москва - Красноярск | _Москва <--> Красноярск | RU- | Автокаты | 2-PM | 2-6 Catalyst
7 | 707698 | 29/12/22 | 04/01/23 | 17/01/23 | Кудрявцева Надежда | 27/12/22 | Косарынская Людмила | АВТОКАТАЛИЗАТОРЫ | Евромет ООО | Москва | Красноярск | Rail | Catalyst | Дверь-в-Дверь | 1449.0 | 63846.87 | 2023-01 | Catalyst | 11.293712 | 1 | Москва - Красноярск | _Москва | Красноярск | _Москва - Красноярск | _Москва <--> Красноярск | RU- | Автокаты | 2-PM | 2-6 Catalyst
8 | 708067 | 03/01/23 | 04/01/23 | 18/01/23 | Скопенко Анна Алексеевна | 02/01/23 | Скопенко Анна Алексеевна | ПРОЧЕЕ | ИрАэро Авиакомпания АО | Москва | Иркутск | Air | Other | Дверь-в-Дверь | 96.0 | 22916.67 | 2023-01 | Other | 2.798032 | 1 | Москва - Иркутск | _Москва | Иркутск | _Москва - Иркутск | _Москва <--> Иркутск | RU- | ИрАэро | 6-O | 6-6 Spare parts
9 | 707969 | 30/12/22 | 08/01/23 | 31/01/23 | Косарынская Людмила | 29/12/22 | Косарынская Людмила | БАНКИ | РОСБАНК ПАО | Москва | Москва | Road | Banknotes | Банкоматы | 1.0 | 500.00 | 2023-01 | Banknotes | 0.001000 | 1 | Москва - Москва | _Москва | _Москва | _Москва - _Москва | _Москва <--> _Москва | RU- | РОСБАНК | 4-ATM/CIT | 4-6 Other

Last rows

(index) | Shipment # | Pickup Date w/o time | Delivery Date w/o time | Last Update Date | Last Update User | Дата создания | Автор | Статья ДДС | Shipper Name | City PU | City DLV | Mode Of Transport Sh | DepartmentCodeDescription | Service Type Name | Gross Weight | Revenue (Local) | period | Group | Liability m₽ | # Sh | route | PU City | DLV City | direction | direction both | RU&VAT | Customer | Group2 | Group det
121080 | 831223 | 27/05/25 | 03/06/25 | 27/05/25 | Оленев Александр Сергеевич | 26/05/25 | Решимова Екатерина | АВТОКАТАЛИЗАТОРЫ | ТД ПДМ | Москва | Красноярск | Rail | Catalyst | Дверь-в-Дверь | 1192.00 | 74592.00 | 2025-06 | Catalyst | 17.0000 | 1 | Москва - Красноярск | _Москва | Красноярск | _Москва - Красноярск | _Москва <--> Красноярск | RU- | ТД ПДМ | 2-PM | 2-6 Catalyst
121081 | 831171 | 03/06/25 | 04/06/25 | 26/05/25 | Кудрявцева Надежда | 26/05/25 | Кудрявцева Надежда | ПРОЧЕЕ | Музейное оборудование и сервис Московский филиал | Москва | Калининград | Air | Art | Дверь-в-Дверь | 294.00 | 0.00 | 2025-06 | Art | 0.0160 | 1 | Москва - Калининград | _Москва | Калининград | _Москва - Калининград | _Москва <--> Калининград | RU- | Музейное оборудование и сервис Московский филиал | 6-O | 6-1 Art
121082 | 831442 | 03/06/25 | 04/06/25 | 27/05/25 | Солодова Елена | 27/05/25 | Солодова Елена | БАНКИ | Сбербанк России ( металлы) | Москва | Челябинск | Air | Gold | Дверь-в-Дверь | 0.14 | 6412.00 | 2025-06 | Metals | 1.1900 | 1 | Москва - Челябинск | _Москва | Челябинск | _Москва - Челябинск | _Москва <--> Челябинск | RU- | Сбербанк | 2-PM | 2-2 PM Banks domestic
121083 | 831421 | 28/05/25 | 04/06/25 | 27/05/25 | Решимова Екатерина | 27/05/25 | Решимова Екатерина | ПРОМЫШЛЕННОСТЬ | Парус ООО | Ангарск | Губкинский | Road | Catalyst | Дверь-в-Дверь | 2600.00 | 316666.67 | 2025-06 | Catalyst | 54.0000 | 1 | Ангарск - Губкинский | Иркутск | Ноябрьск | Иркутск - Ноябрьск | Иркутск <--> Ноябрьск | RU- | Парус ООО | 2-PM | 2-6 Catalyst
121084 | 830905 | 30/05/25 | 05/06/25 | 23/05/25 | Дорофеева Елена | 23/05/25 | Дорофеева Елена | ЮВЕЛИРЫ | Ювелирная группа АЛРОСА | Москва | Ростов-на-Дону | Road | DJ | Дверь-в-Дверь | 30.00 | 26018.60 | 2025-06 | D&J | 15.8000 | 1 | Москва - Ростов-на-Дону | _Москва | Ростов-на-Дону | _Москва - Ростов-на-Дону | _Москва <--> Ростов-на-Дону | RU- | Ювелирная группа АЛРОСА | 3-DJ | 3-3 In city
121085 | 830315 | 04/06/25 | 07/06/25 | 20/05/25 | Солодова Елена | 20/05/25 | Солодова Елена | АФФ.ЗАВОДЫ | Красцветмет ОАО | Усть-Нера | Красноярск | Air | Prec.Metals | Дверь-в-Дверь | 24.00 | 127937.71 | 2025-06 | Metals | 405.0000 | 1 | Усть-Нера - Красноярск | Усть-Нера | Красноярск | Усть-Нера - Красноярск | Красноярск <--> Усть-Нера | RU- | Красцветмет ОАО | 2-PM | 2-3 From mines to aff
121086 | 831247 | 30/05/25 | 08/06/25 | 26/05/25 | Дорофеева Елена | 26/05/25 | Дорофеева Елена | ЮВЕЛИРЫ | Торговый центр Атолл | Новосибирск | Ростов-на-Дону | Air | DJ | Дверь в дверь ОПТИМ | 120.00 | 27690.00 | 2025-06 | D&J | 30.0000 | 1 | Новосибирск - Ростов-на-Дону | Новосибирск | Ростов-на-Дону | Новосибирск - Ростов-на-Дону | Ростов-на-Дону <--> Новосибирск | RU- | Торговый центр Атолл | 3-DJ | 3-3 In city
121087 | 831147 | 05/06/25 | 09/06/25 | 26/05/25 | Дорофеева Елена | 26/05/25 | Дорофеева Елена | ЮВЕЛИРЫ | Ювелирная группа АЛРОСА | Ростов-на-Дону | Смоленск | Road | DJ | Дверь-в-Дверь | 30.00 | 36197.30 | 2025-06 | D&J | 15.8000 | 1 | Ростов-на-Дону - Смоленск | Ростов-на-Дону | Смоленск | Ростов-на-Дону - Смоленск | Смоленск <--> Ростов-на-Дону | RU- | Ювелирная группа АЛРОСА | 3-DJ | 3-3 In city
121088 | 830474 | 12/06/25 | 13/06/25 | 22/05/25 | Кудрявцева Надежда | 21/05/25 | Кудрявцева Надежда | ПРОЧЕЕ | Музейное оборудование и сервис Московский филиал | Москва | Калининград | Air | Art | Дверь-в-Дверь | 100.00 | 163643.88 | 2025-06 | Art | 240.9000 | 1 | Москва - Калининград | _Москва | Калининград | _Москва - Калининград | _Москва <--> Калининград | RU- | Музейное оборудование и сервис Московский филиал | 6-O | 6-1 Art
121089 | 823625 | 01/04/25 | 30/06/25 | 13/05/25 | Новикова Дарья | 31/03/25 | Новикова Дарья | ТРЕЙДИНГ ДМ | БРИГ- МЕТ ООО | Москва | Москва | Road | Prec.Metals | Перевозка по Москве | 16.45 | 7000.00 | 2025-06 | Metals | 15.1958 | 1 | Москва - Москва | _Москва | _Москва | _Москва - _Москва | _Москва <--> _Москва | RU- | БРИГ- МЕТ ООО | 2-PM | 2-5 Other onground trans
12108982362501/04/2530/06/2513/05/25Новикова Дарья31/03/25Новикова ДарьяТРЕЙДИНГ ДМБРИГ- МЕТ ОООМоскваМоскваRoadPrec.MetalsПеревозка по Москве16.457000.002025-06Metals15.19581Москва - Москва_Москва_Москва_Москва - _Москва_Москва <--> _МоскваRU-БРИГ- МЕТ ООО2-PM2-5 Other onground trans